SQL Script Not Joining Results - sql

I am having an issue getting this SQL Script to join the two tables together, I can execute it and it will show the results of the first portion (AccountID, AuditDate ..etc) but it will not join to the inner select statement.
I have verified by spot checking that there is data that matches in the database.
What portion did i mess up?
Select AccountID, AuditDate, SourceVersion
From Audit
Left join (
Select A.PrinterAuditID, A.AuditID, A.SerialNr, A.PageCountTotal,a.PageCountColor, A.PageCountMono
from InvalidPrinterAudit A
where A.DeviceID = 90757
) InvalidPrinterAudit on InvalidPrinterAudit.AuditID = Audit.AuditID

There's a couple of things happening here.
One: In order to have fields in your result set, they must be included in the SELECT portion of your query:
Select AccountID, AuditDate, SourceVersion, InvalidPrinterAudit.PrinterAuditID, InvalidPrinterAudit.AuditID, InvalidPrinterAudit.SerialNr, InvalidPrinterAudit.PageCountTotal,InvalidPrinterAudit.PageCountColor, InvalidPrinterAudit.PageCountMono
From Audit
Left join (
Select A.PrinterAuditID, A.AuditID, A.SerialNr, A.PageCountTotal,a.PageCountColor, A.PageCountMono
from InvalidPrinterAudit A
where A.DeviceID = 90757
) InvalidPrinterAudit on InvalidPrinterAudit.AuditID = Audit.AuditID
Two: You don't need to have a subquery here. Subqueries are great if you need to aggregate the results from a seperate table or something, but here you can just go with a LEFT OUTER JOIN and be done with it.
Select AccountID, AuditDate, SourceVersion, InvalidPrinterAudit.PrinterAuditID, InvalidPrinterAudit.AuditID, InvalidPrinterAudit.SerialNr, InvalidPrinterAudit.PageCountTotal,InvalidPrinterAudit.PageCountColor, InvalidPrinterAudit.PageCountMono
From Audit
Left OUTER JOIN InvalidPrinterAudit
ON InvalidPrinterAudit.AuditID = Audit.AuditID
AND InvalidPrinterAudit.DeviceID = 90757
This will apply that DeviceID = 90757 filter to your InvalidPrinterAudit before the join is applied, so you'll still get all of your Audit records, and then only the InvalidPrinterAudit records for that DeviceID that matches.

You are selecting less columns in the outer query than in the inner query.
If you want them to be returned in the resultset, include them in the outer:
Select AccountID, AuditDate, SourceVersion, PrinterAuditID, SerialNr
From Audit
Left join (
Select A.PrinterAuditID, A.AuditID, A.SerialNr, A.PageCountTotal,a.PageCountColor, A.PageCountMono
from InvalidPrinterAudit A
where A.DeviceID = 90757
) InvalidPrinterAudit on InvalidPrinterAudit.AuditID = Audit.AuditID

Related

SQL subquery with join to main query

I have this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
-- opportunities subquery
(SELECT COUNT(*) FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
WHERE A.OwnerId = SU.SystemUserId AND
YEAR(O.CreatedOn) = 2022)
-- /opportunities subquery
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
I'm getting an error from the subquery:
Column 'SystemUser.SystemUserId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Is it possible to use the SystemUser table join from the main query in this way?
As has been mentioned extensively in the comments, the error is actually telling you the problem; SU.SystemUserId isn't in the GROUP BY nor in an aggregate function, and it appears in the SELECT of the query (albeit in the WHERE of a correlated subquery). Any columns in the SELECT must be either aggregated or in the GROUP BY when using one of the other for a query scope. As the column in question isn't aggregated nor in the GROUP BY, the error occurs.
There are, however, other problems. Like mentioned ikn the comments too, your LEFT JOINs make little sense, as many of the tables you LEFT JOIN to require a column in that table to have a non-NULL value; it is impossible for a column to have a non-NULL value if a row was not found.
You also use syntax like YEAR(<Column Name>) = <int Value> in the WHERE; this is not SARGable, and thus should be avoided. Use explicit date boundaries instead.
I assume here that SU.SystemUserId is a primary key, and so should be in the GROUP BY. This is probably a good thing anyway, as a person's full name isn't something that can be used to determine who a person is on their own (take this from someone who shared their name, date of birth and post code with another person in their youth; it caused many problems on the rudimentary IT systems of the time). This results in a query like this:
SELECT SU.FullName AS Salesperson,
COUNT(DS.new_dealsheetid) AS Units,
SUM(SO.New_profits_sales_totaldealprofit) AS TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) AS PPU,
(SELECT COUNT(*)
FROM dbo.Opportunity O
JOIN dbo.Account A ON O.AccountId = A.AccountId --A.OwnerID must have a non-NULL value, so why was this a LEFT JOIN?
WHERE A.OwnerId = SU.SystemUserId
AND O.CreatedOn >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND O.CreatedOn < '20230101') AS SomeColumnAlias
FROM dbo.New_dealsheet DS
JOIN dbo.SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId --SO.New_purchaseordersenddate must have a non-NULL value, so why was this a LEFT JOIN?
JOIN dbo.New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId --SP.New_SalesGroupIdName must have a non-NULL value, so why was this a LEFT JOIN?
LEFT JOIN dbo.SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId --This actually looks like it can be a LEFT JOIN.
WHERE SO.New_purchaseordersenddate >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND SO.New_purchaseordersenddate < '20230101'
AND SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName,
SU.SystemUserId;
Doing such a sub-query is bad on a performance-wise point of view
better do it like this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
SUM(csq.cnt) as Count
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
-- Moved subquery as sub-join
LEFT JOIN (SELECT a.OwnerId, YEAR(o.CreatedOn) as year, COUNT(*) cnt FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
GROUP BY a.OwnerId, YEAR(o.CreatedOn) as csq ON csq.OwnerId = su.SystemUserId and csqn.Year = 2022
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
So you have a nice join and a clean result
The query above is untested

LEFT JOIN & SUM GROUP BY

EDIT:
The result supposed to be like this:
desired result
I have this query:
SELECT DISTINCT mitarbeiter.mitarbnr, mitarbeiter.login, mitarbeiter.name1, mitarbeiter.name2
FROM vertragspos
left join vertrag_ek_vk_zuord ON vertragspos.id = vertrag_ek_vk_zuord.ek_vertragspos_id
left join mitarbeiter ON vertrag_ek_vk_zuord.anlage_mitarbnr = mitarbeiter.mitarbnr
left join vertragskopf ON vertragskopf.id = vertragspos.vertrag_id
left join
(
SELECT wkurse.*, fremdwaehrung.wsymbol
FROM wkurse
INNER join
(
SELECT lfdnr, Max(tag) AS maxTag
FROM wkurse
WHERE tag < SYSDATE
GROUP BY lfdnr
) t1
ON wkurse.lfdnr = t1.lfdnr AND wkurse.Tag = t1.maxTag
INNER JOIN fremdwaehrung ON wkurse.lfdnr = fremdwaehrung.lfdnr
) wkurse ON vertragskopf.blfdwaehrung = wkurse.lfdnr
left join
(
SELECT vertrag_ID, Sum (preis) preis, Sum (menge) menge, Sum (preis * menge / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
GROUP BY vertrag_ID
) s ON vertragskopf.id = s.vertrag_id
But I always get an error on line 21 Pos 145:
ORA-00904 WKURSE.KURS invalid identifier
The WKURSE table is supposed be joined already above, but why do I still get error?
How can I do join with all these tables?
I need to join all these tables:
Mitarbeiter, Vertragspos, vertrag_ek_vk_zuord, wkurse, fremdwaehrung, vertragskopf.
What is the right syntax? I'm using SQL Tool 1,8 b38
Thank you.
Because LEFT JOIN is executed on entire dataset, and not in row-by-row manner. So there's no wkurse.kurs available in the execution context of subquery. Since you join that tables, you can place the calculation in the top-most select statement.
EDIT:
After you edited the statement, it became clear where does vertragskopf.zahlintervall came from. But I don't know where are you going to use calculated vertragswert (now it is absent in the query), so I've put it in the result. As I'm not a SQL parser and have no idea of your tables, so I cannot check the code, but calculation now can be resolved (all the values are available in calculation context).
SELECT DISTINCT mitarbeiter.mitarbnr, mitarbeiter.login, mitarbeiter.name1, mitarbeiter.name2, s.amount / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
left join vertrag_ek_vk_zuord ON vertragspos.id = vertrag_ek_vk_zuord.ek_vertragspos_id
left join mitarbeiter ON vertrag_ek_vk_zuord.anlage_mitarbnr = mitarbeiter.mitarbnr
left join vertragskopf ON vertragskopf.id = vertragspos.vertrag_id
left join (
SELECT wkurse.*, fremdwaehrung.wsymbol
FROM wkurse
INNER join (
SELECT lfdnr, Max(tag) AS maxTag
FROM wkurse
WHERE tag < SYSDATE
GROUP BY lfdnr
) t1
ON wkurse.lfdnr = t1.lfdnr AND wkurse.Tag = t1.maxTag
INNER JOIN fremdwaehrung ON wkurse.lfdnr = fremdwaehrung.lfdnr
) wkurse ON vertragskopf.blfdwaehrung = wkurse.lfdnr
left join (
SELECT vertrag_ID, Sum (preis) preis, Sum (menge) menge, Sum (preis * menge) as amount
FROM vertragspos
GROUP BY vertrag_ID
) s ON vertragskopf.id = s.vertrag_id
Rewriting the code using WITH clause makes it much clearer than select from select.
Also get the rate on last day before today in oracle is as simple as
select wkurse.lfdnr
, max(wkurse.kurs) keep (dense_rank first order by wkurse.tag desc) as rate
from wkurse
where tag < sysdate
group by wkurse.lfdnr
One option is a lateral join:
left join lateral
(SELECT vertrag_ID, Sum(preis) as preis, Sum(menge) as menge,
Sum (preis * menge / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
GROUP BY vertrag_ID
) s
ON vertragskopf.id = s.vertrag_id

How can I get all the rows from master table and relevent row from the detail table in MS-SQL?

I am using MS-SQL and I am trying to write a query which fetches rows from the master table and related rows from the detail table. Now, the thing I want is that it must only fetch the first row from the master table and related field from the detail tables should be blank in that first row, now if there are related rows found in the detail tables, they must be shown in the separate rows. I have been trying using the following query but it is not giving the desired result.
SELECT
DISTINCT
ProductMaster.ProductMasterID, ProductMaster.ProductName,
ProductMaster.SubCategoryID, SubCategory.SubCategoryName,
ProductBrandAndType.ProductBranAndTypeID, ProductBrandAndType.ProductType,
ProductBrandAndType.Brand, ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductBrandAndType
RIGHT OUTER JOIN
Inward
ON ProductBrandAndType.ProductBranAndTypeID = Inward.ProductBrandAndTypeID
RIGHT OUTER JOIN
ProductMaster
ON Inward.ProductID = ProductMaster.ProductMasterID
LEFT OUTER JOIN
SubCategory
ON ProductMaster.SubCategoryID = SubCategory.SubCategoryID
ORDER BY
ProductMaster.ProductName,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand;
Can anyone help me on this?
Regards
Sikandar
Following query worked.
SELECT dbo.ProductMaster.ProductMasterID, dbo.ProductMaster.ProductName, dbo.ProductMaster.SubCategoryID, dbo.ProductMaster.ProductDesc,
dbo.ProductMaster.ReOrderLevel, null as ProductBrandAndTypeID,
null AS Type, null as Brand
FROM dbo.ProductBrandAndType RIGHT OUTER JOIN
dbo.Inward ON dbo.ProductBrandAndType.ProductBranAndTypeID = dbo.Inward.ProductBrandAndTypeID RIGHT OUTER JOIN
dbo.ProductMaster ON dbo.Inward.ProductID = dbo.ProductMaster.ProductMasterID LEFT OUTER JOIN
dbo.SubCategory ON dbo.ProductMaster.SubCategoryID = dbo.SubCategory.SubCategoryID
UNION
SELECT dbo.ProductMaster.ProductMasterID, dbo.ProductMaster.ProductName, dbo.ProductMaster.SubCategoryID, dbo.ProductMaster.ProductDesc,
dbo.ProductMaster.ReOrderLevel, dbo.ProductBrandAndType.ProductBranAndTypeID,
dbo.ProductBrandAndType.ProductType, dbo.ProductBrandAndType.Brand
FROM dbo.ProductBrandAndType RIGHT OUTER JOIN
dbo.Inward ON dbo.ProductBrandAndType.ProductBranAndTypeID = dbo.Inward.ProductBrandAndTypeID RIGHT OUTER JOIN
dbo.ProductMaster ON dbo.Inward.ProductID = dbo.ProductMaster.ProductMasterID LEFT OUTER JOIN
dbo.SubCategory ON dbo.ProductMaster.SubCategoryID = dbo.SubCategory.SubCategoryID
ORDER BY ProductName;
If you just what the first row from the ProductMaster. Then you can do something like this:
;WITH CTE
(
SELECT
ROW_NUMBER() OVER(ORDER BY ProductMaster.ProductMasterID) AS RowNbr,
ProductMaster.ProductName,
ProductMaster.SubCategoryID,
ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductMaster
)
SELECT
ProductMaster.ProductMasterID,
ProductMaster.ProductName,
ProductMaster.SubCategoryID,
SubCategory.SubCategoryName,
ProductBrandAndType.ProductBranAndTypeID,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand,
ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductBrandAndType
RIGHT OUTER JOIN Inward
ON ProductBrandAndType.ProductBranAndTypeID = Inward.ProductBrandAndTypeID
RIGHT OUTER JOIN CTE AS ProductMaster
ON Inward.ProductID = ProductMaster.ProductMasterID
AND RowNbr=1
LEFT OUTER JOIN SubCategory
ON ProductMaster.SubCategoryID = SubCategory.SubCategoryID
ORDER BY
ProductMaster.ProductName,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand;

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.

Max date in view on left outer join

Thanks to a previous question, I found out how to pull the most recent data based on a linked table. BUT, now I have a related question.
The solution that I found used row_number() and PARTITION to pull the most recent set of data. But what if there's a possibility for zero or more rows in a linked table in the view? For example, the table FollowUpDate might have 0 rows, or 1, or more. I just want the most recent FollowUpDate:
SELECT
EFD.FormId
,EFD.StatusName
,MAX(EFD.ActionDate)
,EFT.Name AS FormType
,ECOA.Account AS ChargeOffAccount
,ERC.Name AS ReasonCode
,EAC.Description AS ApprovalCode
,MAX(EFU.FollowUpDate) AS FollowUpDate
FROM (
SELECT EF.FormId, EFD.ActionDate, EFS.Name AS StatusName, EF.FormTypeId, EF.ChargeOffId, EF.ReasonCodeId, EF.ApprovalCodeId,
row_number() OVER ( PARTITION BY EF.FormId ORDER BY EFD.ActionDate DESC ) DateSortKey
FROM Extension.FormDate EFD INNER JOIN Extension.Form EF ON EFD.FormId = EF.FormId INNER JOIN Extension.FormStatus EFS ON EFD.StatusId = EFS.StatusId
) EFD
INNER JOIN Extension.FormType EFT ON EFD.FormTypeId = EFT.FormTypeId
LEFT OUTER JOIN Extension.ChargeOffAccount ECOA ON EFD.ChargeOffId = ECOA.ChargeOffId
LEFT OUTER JOIN Extension.ReasonCode ERC ON EFD.ReasonCodeId = ERC.ReasonCodeId
LEFT OUTER JOIN Extension.ApprovalCode EAC ON EFD.ApprovalCodeId = EAC.ApprovalCodeId
LEFT OUTER JOIN (Select EFU.FormId, EFU.FollowUpDate, row_number() OVER (PARTITION BY EFU.FormId ORDER BY EFU.FollowUpDate DESC) FUDateSortKey FROM Extension.FormFollowUp EFU INNER JOIN Extension.Form EF ON EFU.FormId = EF.FormId) EFU ON EFD.FormId = EFU.FormId
WHERE EFD.DateSortKey = 1
GROUP BY
EFD.FormId, EFD.ActionDate, EFD.StatusName, EFT.Name, ECOA.Account, ERC.Name, EAC.Description, EFU.FollowUpDate
ORDER BY
EFD.FormId
If I do a similar pull using row_number() and PARTITION, I get the data only if there is at least one row in FollowUpDate. Kinda defeats the purpose of a LEFT OUTER JOIN. Can anyone help me get this working?
I rewrote your query - you had unnecessary subselects, and used row_number() for the FUDateSortKey but didn't use the column:
SELECT t.formid,
t.statusname,
MAX(t.actiondate) 'actiondate',
t.formtype,
t.chargeoffaccount,
t.reasoncode,
t.approvalcode,
MAX(t.followupdate) 'followupdate'
FROM (
SELECT t.formid,
fs.name 'StatusName',
t.actiondate,
ft.name 'formtype',
coa.account 'ChargeOffAccount',
rc.name 'ReasonCode',
ac.description 'ApprovalCode',
ffu.followupdate,
row_number() OVER (PARTITION BY ef.formid ORDER BY t.actiondate DESC) 'DateSortKey'
FROM EXTENSION.FORMDATE t
JOIN EXTENSION.FORM ef ON ef.formid = t.formid
JOIN EXTENSION.FORMSTATUS fs ON fs.statusid = t.statusid
JOIN EXTENSION.FORMTYPE ft ON ft.formtypeid = ef.formtypeid
LEFT JOIN EXTENSION.CHARGEOFFACCOUNT coa ON coa.chargeoffid = ef.chargeoffid
LEFT JOIN EXTENSION.REASONCODE rc ON rc.reasoncodeid = ef.reasoncodeid
LEFT JOIN EXTENSION.APPROVALCODE ac ON ac.approvalcodeid = ef.approvalcodeid
LEFT JOIN EXTENSION.FORMFOLLOWUP ffu ON ffu.formid = t.formid) t
WHERE t.datesortkey = 1
GROUP BY t.formid, t.statusname, t.formtype, t.chargeoffaccount, t.reasoncode, t.approvalcode
ORDER BY t.formid
The change I made to allow for FollowUpDate was to use a LEFT JOIN onto the FORMFOLLOWUP table - you were doing an INNER JOIN, so you'd only get rows with FORMFOLLOWUP records associated.
It's pretty hard to guess what's going on without table definitions and sample data.
Also, this is confusing: "the table FollowUpDate might have 0 rows" and you "want the most recent FollowUpDate." (especially when there is no table named FollowUpDate) There is no "most recent FollowUpDate" if there are zero FollowUpDates.
Maybe you want
WHERE <follow up date row number> in (1,NULL)
I figured it out. And as usual, I need a nap. I just needed to change my subselect to something I would swear I'd tried with no success:
SELECT field1, field2
FROM Table1 t1
LEFT JOIN (
SELECT field3, max(dateColumn)
FROM Table2
GROUP BY
field3
) t2
ON (t1.field1 = t2.field3)