Super Slow Query - sped up, but not perfect... Please help - sql

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte

The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)

Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.

I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.

Related

LEFT JOIN ON COALESCE(a, b, c) - very strange behavior

I have encountered very strange behavior of my query and I wasted a lot of time to understand what causes it, in vane. So I am asking for your help.
SELECT count(*) FROM main_table
LEFT JOIN front_table ON front_table.pk = main_table.fk_front_table
LEFT JOIN info_table ON info_table.pk = front_table.fk_info_table
LEFT JOIN key_table ON key_table.pk = COALESCE(info_table.fk_key_table, front_table.fk_key_table_1, front_table.fk_key_table_2)
LEFT JOIN side_table ON side_table.fk_front_table = front_table.pk
WHERE side_table.pk = (SELECT MAX(pk) FROM side_table WHERE fk_front_table = front_table.pk)
OR side_table.pk IS NULL
Seems like a simple join query, with coalesce, I've used this technique before(not too many times) and it worked right.
In this query I don't ever get nulls for side_table.pk. If I remove coalesce or just don't use key_table, then the query returns rows with many null side_table.pk, but if I add coalesce join, I can't get those nulls.
It seems key_table and side_table don't have anything in common, but the result is so weird.
Also, when I don't use side_table and WHERE clause, the count(*) result with coalesce and without differs, but I can't see any pattern in rows missing, it seems random!
Real query:
SELECT ECHANGE.EXC_AUTO_KEY, STOCK_RESERVATIONS.STR_AUTO_KEY FROM EXCHANGE
LEFT JOIN WO_BOM ON WO_BOM.WOB_AUTO_KEY = EXCHANGE.WOB_AUTO_KEY
LEFT JOIN VIEW_WO_SUB ON VIEW_WO_SUB.WOO_AUTO_KEY = WO_BOM.WOO_AUTO_KEY
LEFT JOIN STOCK stock3 ON stock3.STM_AUTO_KEY = EXCHANGE.STM_AUTO_KEY
LEFT JOIN STOCK stock2 ON stock2.STM_AUTO_KEY = EXCHANGE.ORIG_STM
LEFT JOIN CONSIGNMENT_CODES con2 ON con2.CNC_AUTO_KEY = stock2.CNC_AUTO_KEY
LEFT JOIN CONSIGNMENT_CODES con3 ON con3.CNC_AUTO_KEY = stock3.CNC_AUTO_KEY
LEFT JOIN CI_UTL ON CI_UTL.CUT_AUTO_KEY = EXCHANGE.CUT_AUTO_KEY
LEFT JOIN PART_CONDITION_CODES pcc2 ON pcc2.PCC_AUTO_KEY = stock2.PCC_AUTO_KEY
LEFT JOIN PART_CONDITION_CODES pcc3 ON pcc3.PCC_AUTO_KEY = stock3.PCC_AUTO_KEY
LEFT JOIN STOCK_RESERVATIONS ON STOCK_RESERVATIONS.STM_AUTO_KEY = stock3.STM_AUTO_KEY
LEFT JOIN WAREHOUSE wh2 ON wh2.WHS_AUTO_KEY = stock2.WHS_ORIGINAL
LEFT JOIN SM_HISTORY ON (SM_HISTORY.STM_AUTO_KEY = EXCHANGE.ORIG_STM AND SM_HISTORY.WOB_REF = EXCHANGE.WOB_AUTO_KEY)
LEFT JOIN RC_DETAIL ON stock3.RCD_AUTO_KEY = RC_DETAIL.RCD_AUTO_KEY
LEFT JOIN RC_HEADER ON RC_HEADER.RCH_AUTO_KEY = RC_DETAIL.RCH_AUTO_KEY
LEFT JOIN WAREHOUSE wh3 ON wh3.WHS_AUTO_KEY = COALESCE(RC_DETAIL.WHS_AUTO_KEY, stock3.WHS_ORIGINAL, stock3.WHS_AUTO_KEY)
WHERE STOCK_RESERVATIONS.STR_AUTO_KEY = (SELECT MAX(STR_AUTO_KEY) FROM STOCK_RESERVATIONS WHERE STM_AUTO_KEY = stock3.STM_AUTO_KEY)
OR STOCK_RESERVATIONS.STR_AUTO_KEY IS NULL
Removing LEFT JOIN WAREHOUSE wh3 gives me about unique EXC_AUTO_KEY values with a lot of NULL STR_AUTO_KEY, while leaving this row removes all NULL STR_AUTO_KEY.
I recreated simple tables with numbers with the same structure and query works without any problems o.0
I have a feeling COALESCE is acting as a REQUIRED flag for the joined table, hence shooting the LEFT JOIN to become an INNER JOIN.
Try this:
SELECT COUNT(*)
FROM main_table
LEFT JOIN front_table ON front_table.pk = main_table.fk_front_table
LEFT JOIN info_table ON info_table.pk = front_table.fk_info_table
LEFT JOIN key_table ON key_table.pk = NVL(info_table.fk_key_table, NVL(front_table.fk_key_table_1, front_table.fk_key_table_2))
LEFT JOIN (SELECT fk_, MAX(pk) as pk FROM side_table GROUP BY fk_) st ON st.fk_ = front_table.pk
NVL might behave just the same though...
I undertood what was the problem (not entirely though): there is a LEFT JOIN VIEW_WO_SUB in original query, 3rd line. It causes this query to act in a weird way.
When I replaced the view with the other table which contained the information I needed, the query started returning right results.
Basically, with this view join, NVL, COALESCE or CASE join with combination of certain arguments did not work along with OR clause in WHERE subquery, all rest was fine. ALthough, I could get the query to work with this view join, by changing the order of joined tables, I had to place table participating in where subquery to the bottom.

Recursive query with outer joins?

I'm attempting the following query,
DECLARE #EntityType varchar(25)
SET #EntityType = 'Accessory';
WITH Entities (
E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
)
AS
(
-- Original Query
SELECT dbo.Entity.ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
0 AS RecursiveLevel
FROM dbo.Entity
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = dbo.Entity.ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
LEFT JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
LEFT JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
LEFT JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
LEFT JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
LEFT JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
LEFT JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
WHERE dbo.EntityType.Name = #EntityType
UNION ALL
-- Recursive Query?
SELECT E2.E_ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
(RecursiveLevel + 1)
FROM Entities AS E2
INNER JOIN dbo.Entity ON dbo.Entity.ID = E2.PV_ValueEntityID
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = E2.E_ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
INNER JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
INNER JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
INNER JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
INNER JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
INNER JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
INNER JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
)
SELECT E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
FROM Entities
INNER JOIN [dbo].[Entity] AS dE
ON dE.ID = PV_EntityID
The problem is the second query, the "recursive one" is getting the data I expect since I can't do the LEFT JOINs like in the first query. (At least to my understanding).
If I remove the fetching of the data that requires the LEFT (Outer) JOINs then the recursion works perfectly. My problem is I need both. Is there a way I can accomplish this?
Per http://msdn.microsoft.com/en-us/library/ms175972.aspx you can not have a left/right/outer join in a recursive CTE.
For a recursive CTE you can't use a subquery either so I sugest following this example.
They use two CTE's. The first is not recursive and does the left join to get the data it needs. The second CTE is recursive and inner joins on the first CTE. Since CTE1 is not recursive it can left join and supply default values for the missing rows and is guarenteed to work in the inner join.
However, you can also duplicate a left join with a union and subselect though it isn't really useful normally but it is interesting.
In that case, you would keep your first statement how it is. It will match all rows that join successfully.
Then UNION that query with another query that removes the join, but has a
NOT EXISTS(SELECT 1 FROM MISSING_ROWS_TABLE WHERE MAIN_TABLE.JOIN_CONDITION = MISSING_ROWS_TABLE.JOIN_CONDITION)
This gets all the rows that failed the previous join condition in query 1. You can replace the colmuns you would get from MISSING_ROWS_TABLE with NULL. I had to do this once using a coding framework that didn't support outer joins. Since recursive CTE's don't allow subqueries you have to use the first solution.

Ambiguous outer join in MS Access

Trying to create an outer join on two other joined tables when recieving this error - I just dont see how to create two separate queries to make it work. Subqueries don't seem to work either, any help appreciated. I get errors for the below query, thanks.
SELECT
CardHeader.CardID, CardHeader.CardDescription, CardHeader.GloveSize,
CardHeader.GloveDescription, CardDetail.Bin, CardDetail.ItemID, Items.ItemDescription,
Items.VCatalogID, CardDetail.ChargeCode, CardDetail.Quantity, Items.Cost, CardColors.ColorID
FROM
((Items
INNER JOIN
(CardHeader INNER JOIN CardDetail ON CardHeader.CardID = CardDetail.CardID) ON Items.ItemID = CardDetail.ItemID)
LEFT JOIN
CardColors ON CardDetail.ItemID = CardColors.ItemID)
INNER JOIN
Colors ON CardColors.ColorID = Colors.ID
ORDER BY
CardHeader.CardID;
I tried the following which runs but asks for the following parameters (which it shouldnt)
CardHeader.ID, MainQry.CardID
SELECT
MainQry.ID, MainQry.CardDescription, MainQry.GloveSize,
MainQry.GloveDescription, MainQry.Bin, MainQry.ItemID,
MainQry.ItemDescription, MainQry.VCatalogID, MainQry.ChargeCode,
MainQry.Quantity, MainQry.Cost, SubQry.ColorID
FROM
(SELECT
CardHeader.ID, CardHeader.CardDescription, CardHeader.GloveSize,
CardHeader.GloveDescription, CardDetail.Bin,
CardDetail.ItemID, Items.ItemDescription, Items.VCatalogID,
CardDetail.ChargeCode, CardDetail.Quantity, Items.Cost
FROM
Items
INNER JOIN
(CardHeader
INNER JOIN
CardDetail ON CardHeader.CardID = CardDetail.CardID) ON Items.ItemID = CardDetail.ItemID
) AS MainQry
LEFT JOIN
(SELECT
CardColors.ItemID, CardColors.ColorID
FROM
CardColors
INNER JOIN
Colors ON CardColors.ColorID = Colors.ID) AS SubQry ON MainQry.ItemID = SubQry.ItemID
ORDER BY
MainQry.CardID;
The second SQL statement can be corrected by reference to the first statement and the error. The error is that both CardHeader.ID and MainQry.CardID are prompting for a parameter, which indicates that the inner statement should include CardHeader.CardID, rather than CardHeader.ID

How can I speed up this view which sums a load of quote line rows?

I am writing a view to show quote totals based on summing the values in a quote line table. I need to restrict the view to only show quotes for customers of a particular 'pricetype'. However when I do this the view slows down a lot.
SQL to sum the prices
SELECT dbo.quoteline.qid, SUM((dbo.pricelist.listprice - dbo.quoteline.voff) * dbo.quoteline.quantity) AS total
FROM dbo.quoteline LEFT OUTER JOIN dbo.pricelist ON dbo.quoteline.prodcode = dbo.pricelist.prodcode GROUP BY dbo.quoteline.qid
SQL once 'pricetype' constraint is added
SELECT dbo.quoteline.qid, SUM((dbo.pricelist.listprice - dbo.quoteline.voff) * dbo.quoteline.quantity) AS total
FROM dbo.pricelist RIGHT OUTER JOIN
dbo.client RIGHT OUTER JOIN
dbo.quote ON dbo.client.cid = dbo.quote.cid RIGHT OUTER JOIN
dbo.quoteline ON dbo.quote.qid = dbo.quoteline.qid ON dbo.pricelist.prodcode = dbo.quoteline.prodcode
WHERE (dbo.client.pricetype = 'V')
GROUP BY dbo.quoteline.qid
Maybe its late and I am having a moment but any help here would be gratefully appreciated.
Two things: First, can you put an index on the dbo.client.pricetype column without it interfering with inserts/updates? Secondly, inner joins are generally faster than outer joins and since your results and where clause depend on the other tables, I suspect you will want to do inner joins anyways unless there are NULL records you need back from your view. Try this following query to see if it gets you the results you need:
SELECT dbo.quoteline.qid, SUM((dbo.pricelist.listprice - dbo.quoteline.voff) * dbo.quoteline.quantity) AS total
FROM dbo.quoteline
INNER JOIN dbo.pricelist ON dbo.quoteline.prodcode = dbo.pricelist.prodcode
INNER JOIN dbo.quote ON dbo.quote.qid = dbo.quoteline.qid
INNER JOIN dbo.client ON dbo.client.cid = dbo.quote.cid
WHERE (dbo.client.pricetype = 'V')
GROUP BY dbo.quoteline.qid
What does happen if you do it like this :
SELECT dbo.quoteline.qid, SUM((dbo.pricelist.listprice - dbo.quoteline.voff) * dbo.quoteline.quantity) AS total
FROM dbo.pricelist RIGHT OUTER JOIN
dbo.client ON dbo.client.pricetype = 'V' RIGHT OUTER JOIN
dbo.quote ON dbo.client.cid = dbo.quote.cid RIGHT OUTER JOIN
dbo.quoteline ON dbo.quote.qid = dbo.quoteline.qid AND dbo.pricelist.prodcode = dbo.quoteline.prodcode
GROUP BY dbo.quoteline.qid

Top 1 on Left Join SubQuery

I am trying to take a person and display their current insurance along with their former insurance. I guess one could say that I'm trying to flaten my view of customers or people. I'm running into an issue where I'm getting multiple records back due to multiple records existing within my left join subqueries. I had hoped I could solve this by adding "TOP 1" to the subquery, but that actually returns nothing...
Any ideas?
SELECT
p.person_id AS 'MIRID'
, p.firstname AS 'FIRST'
, p.lastname AS 'LAST'
, pg.name AS 'GROUP'
, e.name AS 'AOR'
, p.leaddate AS 'CONTACT DATE'
, [dbo].[GetPICampaignDisp](p.person_id, '2009') AS 'PI - 2009'
, [dbo].[GetPICampaignDisp](p.person_id, '2008') AS 'PI - 2008'
, [dbo].[GetPICampaignDisp](p.person_id, '2007') AS 'PI - 2007'
, a_disp.name AS 'CURR DISP'
, a_ins.name AS 'CURR INS'
, a_prodtype.name AS 'CURR INS TYPE'
, a_t.date AS 'CURR INS APP DATE'
, a_t.effdate AS 'CURR INS EFF DATE'
, b_disp.name AS 'PREV DISP'
, b_ins.name AS 'PREV INS'
, b_prodtype.name AS 'PREV INS TYPE'
, b_t.date AS 'PREV INS APP DATE'
, b_t.effdate AS 'PREV INS EFF DATE'
, b_t.termdate AS 'PREV INS TERM DATE'
FROM
[person] p
LEFT OUTER JOIN
[employee] e
ON
e.employee_id = p.agentofrecord_id
INNER JOIN
[dbo].[person_physician] pp
ON
p.person_id = pp.person_id
INNER JOIN
[dbo].[physician] ph
ON
ph.physician_id = pp.physician_id
INNER JOIN
[dbo].[clinic] c
ON
c.clinic_id = ph.clinic_id
INNER JOIN
[dbo].[d_Physgroup] pg
ON
pg.d_physgroup_id = c.physgroup_id
LEFT OUTER JOIN
(
SELECT
tr1.*
FROM
[transaction] tr1
LEFT OUTER JOIN
[d_vendor] ins1
ON
ins1.d_vendor_id = tr1.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype1
ON
prodtype1.d_product_type_id = tr1.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype1
ON
ctype1.d_commission_type_id = tr1.d_commission_type_id
WHERE
prodtype1.name <> 'Medicare Part D'
AND tr1.termdate IS NULL
) AS a_t
ON
a_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] a_ins
ON
a_ins.d_vendor_id = a_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] a_prodtype
ON
a_prodtype.d_product_type_id = a_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] a_ctype
ON
a_ctype.d_commission_type_id = a_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] a_disp
ON
a_disp.d_disposition_id = a_t.d_disposition_id
LEFT OUTER JOIN
(
SELECT
tr2.*
FROM
[transaction] tr2
LEFT OUTER JOIN
[d_vendor] ins2
ON
ins2.d_vendor_id = tr2.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype2
ON
prodtype2.d_product_type_id = tr2.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype2
ON
ctype2.d_commission_type_id = tr2.d_commission_type_id
WHERE
prodtype2.name <> 'Medicare Part D'
AND tr2.termdate IS NOT NULL
) AS b_t
ON
b_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] b_ins
ON
b_ins.d_vendor_id = b_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] b_prodtype
ON
b_prodtype.d_product_type_id = b_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] b_ctype
ON
b_ctype.d_commission_type_id = b_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] b_disp
ON
b_disp.d_disposition_id = b_t.d_disposition_id
WHERE
pg.d_physgroup_id = #PhysGroupID
In Sql server 2005 you can use OUTER APPLY
SELECT p.person_id, s.e.employee_id
FROM person p
OUTER APPLY (SELECT TOP 1 *
FROM Employee
WHERE /*JOINCONDITION*/
ORDER BY /*Something*/ DESC) s
http://technet.microsoft.com/en-us/library/ms175156.aspx
The pattern I normally use for this is:
SELECT whatever
FROM person
LEFT JOIN subtable AS s1
ON s1.personid = person.personid
...
WHERE NOT EXISTS
( SELECT 1 FROM subtable
WHERE personid = person.personid
AND orderbydate > s1.orderbydate
)
Which avoids the TOP 1 clause and maybe makes it a little clearer.
BTW, I like the way you've put this query together in general, except I'd leave out the brackets, assuming you have rationally named tables and columns; and you might even gain some performance (but at least elegance) by listing columns for tr1 and tr2, rather than "tr1.*" and "tr2.*".
Thanks for all of the feedback and ideas...
In the simplest of terms, I have a person table that stores contact information like name, email, etc. I have another table that stores transactions. Each transaction is really an insurance policy that would contain information on the provider, product type, product name, etc.
I want to avoid giving the user duplicate person records since this causes them to look for the duplicates prior to running mail merges, etc. I'm getting duplicates when there is more than 1 transaction that has not been terminated, and when there is more than 1 transaction that has been terminated.
Someone else suggested that I consider a cursor to grab my distinct contact records and then perform the sub selects to get the current and previous insurance information. I don't know if I want to head down that path though.
It's difficult to understand your question so first I'll throw this out there: does changing your SELECT to SELECT DISTINCT do what you want?
Otherwise, let me get this straight, you're trying to get your customers' current insurance and previous insurance, but they may actually have many insurances before that, recorded in the [transactions] table? I looked at your SQL for quite a few minutes but can't figure out what it all means, so could you please reduce it down to only the parts that are necessary? Then I'll think about it some more. It sounds to me like you need a GROUP BY somehow, but I can't work it out exactly yet.
Couldn't take the time to dig through all your SQL (what a beast!), here's an idea that might make things easier to handle:
select
p.person_id, p.name <and other person columns>,
(select <current policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for current policy>),
(select <previous policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for previouspolicy>),
<other columns>
from person p <and "directly related" tables>
This makes the statement easier to read by separating the different parts into their own subselects, and it also makes it easier to add a "Top 1" in without affecting the rest of the statement. Hope that helps.