Max date in view on left outer join - sql

Thanks to a previous question, I found out how to pull the most recent data based on a linked table. BUT, now I have a related question.
The solution that I found used row_number() and PARTITION to pull the most recent set of data. But what if there's a possibility for zero or more rows in a linked table in the view? For example, the table FollowUpDate might have 0 rows, or 1, or more. I just want the most recent FollowUpDate:
SELECT
EFD.FormId
,EFD.StatusName
,MAX(EFD.ActionDate)
,EFT.Name AS FormType
,ECOA.Account AS ChargeOffAccount
,ERC.Name AS ReasonCode
,EAC.Description AS ApprovalCode
,MAX(EFU.FollowUpDate) AS FollowUpDate
FROM (
SELECT EF.FormId, EFD.ActionDate, EFS.Name AS StatusName, EF.FormTypeId, EF.ChargeOffId, EF.ReasonCodeId, EF.ApprovalCodeId,
row_number() OVER ( PARTITION BY EF.FormId ORDER BY EFD.ActionDate DESC ) DateSortKey
FROM Extension.FormDate EFD INNER JOIN Extension.Form EF ON EFD.FormId = EF.FormId INNER JOIN Extension.FormStatus EFS ON EFD.StatusId = EFS.StatusId
) EFD
INNER JOIN Extension.FormType EFT ON EFD.FormTypeId = EFT.FormTypeId
LEFT OUTER JOIN Extension.ChargeOffAccount ECOA ON EFD.ChargeOffId = ECOA.ChargeOffId
LEFT OUTER JOIN Extension.ReasonCode ERC ON EFD.ReasonCodeId = ERC.ReasonCodeId
LEFT OUTER JOIN Extension.ApprovalCode EAC ON EFD.ApprovalCodeId = EAC.ApprovalCodeId
LEFT OUTER JOIN (Select EFU.FormId, EFU.FollowUpDate, row_number() OVER (PARTITION BY EFU.FormId ORDER BY EFU.FollowUpDate DESC) FUDateSortKey FROM Extension.FormFollowUp EFU INNER JOIN Extension.Form EF ON EFU.FormId = EF.FormId) EFU ON EFD.FormId = EFU.FormId
WHERE EFD.DateSortKey = 1
GROUP BY
EFD.FormId, EFD.ActionDate, EFD.StatusName, EFT.Name, ECOA.Account, ERC.Name, EAC.Description, EFU.FollowUpDate
ORDER BY
EFD.FormId
If I do a similar pull using row_number() and PARTITION, I get the data only if there is at least one row in FollowUpDate. Kinda defeats the purpose of a LEFT OUTER JOIN. Can anyone help me get this working?

I rewrote your query - you had unnecessary subselects, and used row_number() for the FUDateSortKey but didn't use the column:
SELECT t.formid,
t.statusname,
MAX(t.actiondate) 'actiondate',
t.formtype,
t.chargeoffaccount,
t.reasoncode,
t.approvalcode,
MAX(t.followupdate) 'followupdate'
FROM (
SELECT t.formid,
fs.name 'StatusName',
t.actiondate,
ft.name 'formtype',
coa.account 'ChargeOffAccount',
rc.name 'ReasonCode',
ac.description 'ApprovalCode',
ffu.followupdate,
row_number() OVER (PARTITION BY ef.formid ORDER BY t.actiondate DESC) 'DateSortKey'
FROM EXTENSION.FORMDATE t
JOIN EXTENSION.FORM ef ON ef.formid = t.formid
JOIN EXTENSION.FORMSTATUS fs ON fs.statusid = t.statusid
JOIN EXTENSION.FORMTYPE ft ON ft.formtypeid = ef.formtypeid
LEFT JOIN EXTENSION.CHARGEOFFACCOUNT coa ON coa.chargeoffid = ef.chargeoffid
LEFT JOIN EXTENSION.REASONCODE rc ON rc.reasoncodeid = ef.reasoncodeid
LEFT JOIN EXTENSION.APPROVALCODE ac ON ac.approvalcodeid = ef.approvalcodeid
LEFT JOIN EXTENSION.FORMFOLLOWUP ffu ON ffu.formid = t.formid) t
WHERE t.datesortkey = 1
GROUP BY t.formid, t.statusname, t.formtype, t.chargeoffaccount, t.reasoncode, t.approvalcode
ORDER BY t.formid
The change I made to allow for FollowUpDate was to use a LEFT JOIN onto the FORMFOLLOWUP table - you were doing an INNER JOIN, so you'd only get rows with FORMFOLLOWUP records associated.

It's pretty hard to guess what's going on without table definitions and sample data.
Also, this is confusing: "the table FollowUpDate might have 0 rows" and you "want the most recent FollowUpDate" (especially when there is no table named FollowUpDate). There is no "most recent FollowUpDate" if there are zero FollowUpDates.
Maybe you want
WHERE <follow up date row number> = 1 OR <follow up date row number> IS NULL
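A NULL row number has to be tested with IS NULL. An IN (1, NULL) list does not match it, because under SQL's three-valued logic a comparison with NULL is never true. Here is a minimal sketch of the difference, using Python's sqlite3 module as a stand-in for SQL Server; the table t and its columns form_id and rn are hypothetical names made up for the illustration.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (form_id INTEGER, rn INTEGER)")
# Form 3 has no follow-up, so its row number is NULL after an outer join.
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 1), (2, 2), (3, None)])

# NULL inside an IN list never matches a NULL value.
in_list = conn.execute(
    "SELECT form_id FROM t WHERE rn IN (1, NULL) ORDER BY form_id"
).fetchall()

# An explicit IS NULL test keeps the form without follow-ups.
is_null = conn.execute(
    "SELECT form_id FROM t WHERE rn = 1 OR rn IS NULL ORDER BY form_id"
).fetchall()

print(in_list)  # [(1,)]
print(is_null)  # [(1,), (3,)]
```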

I figured it out. And as usual, I need a nap. I just needed to change my subselect to something I would swear I'd tried with no success:
SELECT field1, field2
FROM Table1 t1
LEFT JOIN (
SELECT field3, max(dateColumn) AS maxDate
FROM Table2
GROUP BY
field3
) t2
ON (t1.field1 = t2.field3)
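To illustrate why this shape works: the subselect collapses the linked table to one row per key before the LEFT JOIN, so rows with no match survive with a NULL date. The sketch below runs the pattern in Python's sqlite3 (a stand-in, not SQL Server); the two-table schema and the sample dates are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the Form and FormFollowUp tables.
conn.executescript("""
CREATE TABLE Form (FormId INTEGER PRIMARY KEY);
CREATE TABLE FormFollowUp (FormId INTEGER, FollowUpDate TEXT);
INSERT INTO Form VALUES (1), (2), (3);
INSERT INTO FormFollowUp VALUES
    (1, '2010-01-05'), (1, '2010-02-10'), (2, '2010-03-01');
""")

# Aggregate first (one row per FormId), then outer join to keep every form.
rows = conn.execute("""
SELECT f.FormId, fu.LatestFollowUp
FROM Form f
LEFT JOIN (
    SELECT FormId, MAX(FollowUpDate) AS LatestFollowUp
    FROM FormFollowUp
    GROUP BY FormId
) fu ON fu.FormId = f.FormId
ORDER BY f.FormId
""").fetchall()

for row in rows:
    print(row)  # form 3 has no follow-ups and shows None
```

Form 3 has zero follow-up rows and still appears, which is the behaviour the LEFT JOIN was meant to give.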

Related

LEFT JOIN not keeping only records that occur in a SELECT query

I have the following SQL select statement that I use to get a subset of products, or wines:
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
The length of this table generated is 3,905. I want to get all the transactional data for these products.
At the moment I'm using this select statement
SELECT c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
LEFT JOIN Dim.Calendar AS c
ON f.SkDateId = c.SkDateId
LEFT JOIN (
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) AS s
ON s.id = f.SkProductVariantId
WHERE c.CalDate LIKE '%2019%'
The calendar dates are correct, but the number of unique products returned is 5,648, rather than the expected 3,905 from the select query.
Why does my LEFT JOIN on the first select query not work as I expect it to, please?
Thanks for any help!
If you want all the rows from your query, it needs to be the first reference in the LEFT JOIN. Then, I am guessing that you want transactions in 2019:
select . . .
from (SELECT pv.SkProdVariantId AS id, pa.Colour AS colour
FROM Dim.ProductVariant pv JOIN
ProductAttributes_new pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) s LEFT JOIN
(fact.FTransactions f JOIN
Dim.Calendar c
ON f.SkDateId = c.SkDateId AND
c.CalDate >= '2019-01-01' AND
c.CalDate < '2020-01-01'
)
ON s.id = f.SkProductVariantId;
Note that this assumes that CalDate is really a date and not a string. LIKE should only be used on strings.
You misunderstand somehow how outer joins work. See Gordon's answer and my request comment on that.
As to the task: It seems you want to select transactions of 2019, but you want to restrict your results to wine products. We typically restrict query results in the WHERE clause. You can use IN or EXISTS for that.
SELECT
c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
INNER JOIN Dim.Calendar AS c ON f.SkDateId = c.SkDateId
WHERE DATEPART(YEAR, c.CalDate) = 2019
AND f.SkProductVariantId IN
(
SELECT pv.SkProdVariantId
FROM Dim.ProductVariant AS pv
WHERE pv.ProdTypeName = 'Wines'
);
(I've removed the join to ProductAttributes_new, because it doesn't seem to play any part in this query.)
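The difference between the two behaviours can be sketched on toy data: a LEFT JOIN from the transactions keeps every transaction, including non-wine ones, whereas an IN filter restricts the result. This runs in Python's sqlite3 as a stand-in for SQL Server; the Product and Txn tables are simplified, hypothetical substitutes for Dim.ProductVariant and fact.FTransactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: two wines, one beer, four transactions.
conn.executescript("""
CREATE TABLE Product (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE Txn (product_id INTEGER, quantity INTEGER);
INSERT INTO Product VALUES (1, 'Wines'), (2, 'Wines'), (3, 'Beers');
INSERT INTO Txn VALUES (1, 5), (2, 4), (3, 2), (3, 1);
""")

wines = "SELECT id FROM Product WHERE type = 'Wines'"

# LEFT JOIN keeps every row of the left table (Txn), matched or not.
left_join_ids = [r[0] for r in conn.execute(f"""
    SELECT DISTINCT f.product_id
    FROM Txn f
    LEFT JOIN ({wines}) s ON s.id = f.product_id
    ORDER BY f.product_id""")]

# Filtering in WHERE restricts the transactions to wine products.
filtered_ids = [r[0] for r in conn.execute(f"""
    SELECT DISTINCT f.product_id
    FROM Txn f
    WHERE f.product_id IN ({wines})
    ORDER BY f.product_id""")]

print(left_join_ids)  # [1, 2, 3]
print(filtered_ids)   # [1, 2]
```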

SELECT statement where rows are omitted based on another table

A table with orders has an associated table with positions. I want the orders to be shown, but with only the most up-to-date position for each. Below is a picture of the 3 rows I want showing. Omit the rest.
SELECT DispatchTable.ordernumber, DispatchTable.truck,
DispatchTable.driver, DispatchTable.actualpickup,
DispatchTable.actualdropoff, orders.pickupdateandtime,
orders.dropoffdateandtime, Truck002.lastposition,
Truck002.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN Truck002 ON DispatchTable.truck = Truck002.name
WHERE (orders.status = 'onRoute')
Assuming that you want the row having the latest lastdateandtime for the truck name, this should work:
SELECT DispatchTable.ordernumber,
DispatchTable.truck,
DispatchTable.driver,
DispatchTable.actualpickup,
DispatchTable.actualdropoff,
orders.pickupdateandtime,
orders.dropoffdateandtime,
TruckLatest.lastposition,
TruckLatest.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN (SELECT name,
lastposition,
lastdateandtime
FROM Truck002 Truck1
WHERE lastdateandtime =
(SELECT MAX(lastdateandtime)
FROM Truck002 Truck2
WHERE Truck2.name = Truck1.name)) TruckLatest
ON DispatchTable.truck = TruckLatest.name
WHERE (orders.status = 'onRoute')
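As a rough check of the correlated-subquery idea, here is a small sketch in Python's sqlite3 (a stand-in for the actual database). The sample rows are invented, and the tables are cut down to the columns needed for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical versions of DispatchTable and Truck002.
conn.executescript("""
CREATE TABLE DispatchTable (ordernumber INTEGER, truck TEXT);
CREATE TABLE Truck002 (name TEXT, lastposition TEXT, lastdateandtime TEXT);
INSERT INTO DispatchTable VALUES (1001, 'T1'), (1002, 'T2');
INSERT INTO Truck002 VALUES
    ('T1', 'A', '2017-01-01 08:00'),
    ('T1', 'B', '2017-01-01 09:00'),
    ('T2', 'C', '2017-01-01 07:30');
""")

# Keep only the position row whose timestamp is the maximum for its truck.
latest = conn.execute("""
SELECT d.ordernumber, t1.name, t1.lastposition
FROM DispatchTable d
INNER JOIN Truck002 t1 ON d.truck = t1.name
WHERE t1.lastdateandtime = (SELECT MAX(t2.lastdateandtime)
                            FROM Truck002 t2
                            WHERE t2.name = t1.name)
ORDER BY d.ordernumber
""").fetchall()

print(latest)  # [(1001, 'T1', 'B'), (1002, 'T2', 'C')]
```

If two rows for the same truck share the maximum timestamp, this approach returns both of them.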
If I understand correctly, you can get the most recent record for a truck using ROW_NUMBER():
SELECT dt.ordernumber, dt.truck,
dt.driver, dt.actualpickup,
dt.actualdropoff, o.pickupdateandtime,
o.dropoffdateandtime, t.lastposition,
t.lastdateandtime
FROM DispatchTable dt INNER JOIN
orders o
ON dt.ordernumber = o.id INNER JOIN
(SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY t.name ORDER BY t.lastdateandtime DESC) as seqnum
FROM Truck002 t
) t
ON dt.truck = t.name
WHERE o.status = 'onRoute' AND seqnum = 1;
Firstly, why are you using Truck002's name field rather than its id field as the link to DispatchTable? This is considered a less efficient way of doing it than using id (which is either a numerical field or a shorter string than name).
Secondly, you should mention in your Question that each Order can have many DispatchTable rows and that each DispatchTable row can have many Truck002 rows, otherwise many people will start by assuming that it is the other way round between DispatchTable and Truck002.
Thirdly, please try...
SELECT DispatchTable.ordernumber,
DispatchTable.truck,
DispatchTable.driver,
DispatchTable.actualpickup,
DispatchTable.actualdropoff,
orders.pickupdateandtime,
orders.dropoffdateandtime,
Truck002.lastposition,
Truck002.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN Truck002 ON DispatchTable.truck = Truck002.name
WHERE (orders.status = 'onRoute')
GROUP BY ordernumber
HAVING lastdateandtime = MAX( lastdateandtime )
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://msdn.microsoft.com/en-us/library/bb177906(v=office.12).aspx (on HAVING)
https://www.w3schools.com/sql/sql_having.asp (on HAVING)
https://msdn.microsoft.com/en-us/library/bb177905(v=office.12).aspx (on GROUP BY)
https://www.w3schools.com/sql/sql_groupby.asp (on GROUP BY)

SQL Join only if all records have a match

I have 3 tables:
CP_carthead (idOrder)
CP_cartrows (idOrder, idCartRow)
CP_shipping (idCartRow, idShipping, dateShipped)
There can be multiple idCartRows per idOrder.
I want to get all orders where all its idCartRows exist in CP_shipping. This seems like it should be simple, but I haven't found much on the web.
Here's my query now:
SELECT
s.idOrder
, s.LatestDateShipped
FROM
CP_carthead o
LEFT OUTER JOIN (
SELECT
MAX(s.dateShipped) [LatestDateShipped]
, r.idOrder
FROM
CP_shipping s
LEFT OUTER JOIN CP_cartrows r ON s.idCartRow = r.idCartRow
GROUP BY
r.idOrder
) s ON o.idOrder = s.idOrder
Your query is returning rows from "s" and not the orders. Based on your question, I came up with this query:
select o.*
from CP_Carthead o
where o.idOrder in (select cr.idOrder
from cp_cartrows cr left outer join
cp_shipping s
on cr.idCartRow = s.IdCartrow
group by cr.idOrder
having count(s.idCartRow) = COUNT(*)
)
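The subquery in the IN clause returns the orders all of whose cart rows are in shipping: COUNT(s.idCartRow) counts only the matched rows, so it equals COUNT(*) only when no cart row is missing from CP_shipping.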
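Here is a small sketch of that counting trick on made-up data, run through Python's sqlite3 as a stand-in for SQL Server. The table and column names follow the question; the sample rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CP_carthead (idOrder INTEGER PRIMARY KEY);
CREATE TABLE CP_cartrows (idOrder INTEGER, idCartRow INTEGER);
CREATE TABLE CP_shipping (idCartRow INTEGER, dateShipped TEXT);
INSERT INTO CP_carthead VALUES (1), (2), (3);
-- Order 1: both rows shipped. Order 2: one of two shipped. Order 3: no rows.
INSERT INTO CP_cartrows VALUES (1, 10), (1, 11), (2, 20), (2, 21);
INSERT INTO CP_shipping VALUES
    (10, '2013-01-01'), (11, '2013-01-02'), (20, '2013-01-03');
""")

# COUNT(s.idCartRow) skips NULLs from unmatched rows; COUNT(*) does not.
fully_shipped = conn.execute("""
SELECT o.idOrder
FROM CP_carthead o
WHERE o.idOrder IN (SELECT cr.idOrder
                    FROM CP_cartrows cr
                    LEFT OUTER JOIN CP_shipping s
                        ON cr.idCartRow = s.idCartRow
                    GROUP BY cr.idOrder
                    HAVING COUNT(s.idCartRow) = COUNT(*))
ORDER BY o.idOrder
""").fetchall()

print(fully_shipped)  # [(1,)]
```

Note that an order with no cart rows at all (order 3 here) never appears in the subquery, so it is excluded too.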

SQL Server 2012 -- loop keys

I have written a query which left joins separate tables and attempts to discover at a point in time (the point of insertion) which PK keys are the newest in a db for a given record.
You will see I have pared it back to PatientID 100, as this is the only way I can seem to get it working.
The current query works as shown:
SELECT TOP 1 P1.PatientID,
P1.DimPatientPK,
DA1.DimAdmissionPK,
DD1.DiagnosisPK,
DI1.Investigation1PK,
DIE1.InvestigationECGPK,
IEG1.InvestigationEchoGoldPK,
MH1.DimMedicalHistoryPK,
FH1.DimPatientFamilyHistoryPK,
PHT1.PatientHospitalisationTreatmentPK,
PMP1.PatientMedicalPersonnelPK,
RR1.PatientReferralReasonPK,
PEA1.PhysicalExamAHSPK,
PEM1.PhysicalExamMurmursPK,
SI1.SocialIssuePK,
TRT.TreatmentPK
--DT1.Treatment1PK
FROM
DimPatient P1 LEFT JOIN DimAdmission DA1 ON P1.PatientID = DA1.PatientID
LEFT JOIN DimDiagnosis DD1 ON P1.PatientID = DD1.PatientID
LEFT JOIN DimInvestigation1 DI1 ON P1.PatientID = DI1.PatientID
LEFT JOIN DimInvestigationECG DIE1 ON P1.PatientID = DIE1.PatientId
LEFT JOIN DimInvestigationECHOgold IEG1 ON P1.PatientID = IEG1.PatientId
LEFT JOIN DimMedicalHistory MH1 ON P1.PatientID = MH1.PatientId
LEFT JOIN DimPatientFamilyHistory FH1 ON P1.PatientId = FH1.PatientID
LEFT JOIN DimPatientHospitalisationTreatment PHT1 ON P1.PatientID = PHT1.PatientId
LEFT JOIN DimPatientMedicalPersonnel PMP1 ON P1.PatientID = PMP1.PatientId
LEFT JOIN DimPatientReferralReason RR1 ON P1.PatientID = RR1.PatientId
LEFT JOIN DimPhysicalExamAHS PEA1 ON P1.PatientId = PEA1.PatientId
LEFT JOIN DimPhysicalExamination PE1 ON P1.PatientID = PE1.PatientId
LEFT JOIN DimPhysicalExamMurmurs PEM1 ON P1.PatientID = PEM1.PatientId
LEFT JOIN DimSocialIssue SI1 ON P1.PatientID = SI1.PatientID
LEFT JOIN DimTreatment TRT ON P1.PatientID = TRT.PatientId
WHERE P1.patientid IN(100)
ORDER BY DA1.DimAdmissionPK DESC,
P1.DimPatientPK DESC,
DD1.DiagnosisPK DESC,
DI1.Investigation1PK DESC,
DIE1.InvestigationECGPK DESC,
IEG1.InvestigationEchoGoldPK DESC,
MH1.DimMedicalHistoryPK DESC,
FH1.DimPatientFamilyHistoryPK DESC,
PHT1.PatientHospitalisationTreatmentPK DESC,
PMP1.PatientMedicalPersonnelPK DESC,
RR1.PatientReferralReasonPK DESC,
PEA1.PhysicalExamAHSPK DESC,
PE1.PhysicalExaminationPK DESC,
PEM1.PhysicalExamMurmursPK DESC,
SI1.SocialIssuePK DESC,
TRT.TreatmentPK DESC;
This successfully recovers a full record, whether or not it has been filled out, for PatientID 100.
I am having trouble expanding this so that it loops through and collects the same results for every patient in the db.
i.e. if I remove the WHERE clause, I still get only 1 row.
If I remove SELECT TOP 1, it returns multiple rows for PatientID 90. I basically want 1 row for each PatientID (i.e. 90, 91, 92) with the corresponding maximum key value from each table matched.
Anyone have any ideas on how to achieve this?
One (or more) tables you are joining is empty. Change your joins to left outer join, i.e.
LEFT OUTER JOIN DimAdmission DA1 ON P1.PatientID = DA1.PatientID
for all joined tables. The empty table will have Nulls in the results set columns.
If multiple selections for the same patient ID exist, then one or more of your tables has more than one record for each patient ID. You either need to exclude these tables from the query or look at your data structure and see if there is a field which will let you select single record.
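I suggest you use the ROW_NUMBER function. A window function cannot appear directly in a WHERE clause, so compute it in a derived table and filter on it in the outer query, e.g.
SELECT * FROM (SELECT P1.PatientID, ...,
ROW_NUMBER() OVER (PARTITION BY P1.PatientID ORDER BY DA1.DimAdmissionPK DESC,
P1.DimPatientPK DESC, ... ) AS rn FROM DimPatient P1 LEFT JOIN ...) x
WHERE x.rn = 1
By the "..." inside ORDER BY I mean move your entire ORDER BY clause inside the ROW_NUMBER function; the other "..." stand for your existing column list and joins.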
Good luck - it seems you are missing a Fact table ...
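As a sketch of what a per-patient ROW_NUMBER could look like, the snippet below runs a cut-down version in Python's sqlite3 (a stand-in for SQL Server 2012; window functions need SQLite 3.25 or later). Only two of the detail tables are included, and the sample keys are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for three patients and two detail tables.
conn.executescript("""
CREATE TABLE DimPatient (PatientID INTEGER, DimPatientPK INTEGER);
CREATE TABLE DimAdmission (PatientID INTEGER, DimAdmissionPK INTEGER);
CREATE TABLE DimDiagnosis (PatientID INTEGER, DiagnosisPK INTEGER);
INSERT INTO DimPatient VALUES (90, 1), (91, 2), (92, 3);
INSERT INTO DimAdmission VALUES (90, 10), (90, 11), (91, 12);
INSERT INTO DimDiagnosis VALUES (90, 20), (90, 21);
""")

# Number the joined rows per patient, newest keys first, and keep row 1.
per_patient = conn.execute("""
SELECT PatientID, DimAdmissionPK, DiagnosisPK
FROM (
    SELECT P1.PatientID, DA1.DimAdmissionPK, DD1.DiagnosisPK,
           ROW_NUMBER() OVER (
               PARTITION BY P1.PatientID
               ORDER BY DA1.DimAdmissionPK DESC, DD1.DiagnosisPK DESC
           ) AS rn
    FROM DimPatient P1
    LEFT JOIN DimAdmission DA1 ON P1.PatientID = DA1.PatientID
    LEFT JOIN DimDiagnosis DD1 ON P1.PatientID = DD1.PatientID
) x
WHERE rn = 1
ORDER BY PatientID
""").fetchall()

for row in per_patient:
    print(row)  # one row per patient; missing details show None
```

Because the detail tables are joined only on PatientID, each patient's rows form a cross product of its detail keys, so the first row in that ordering carries the maximum key from each table.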

How can I get all the rows from master table and relevant row from the detail table in MS-SQL?

I am using MS-SQL and I am trying to write a query which fetches rows from the master table and the related rows from the detail table. What I want is this: for each product, the first row should come from the master table alone, with the fields from the detail tables blank in that row; if related rows are found in the detail tables, they must be shown in separate rows. I have been trying the following query, but it is not giving the desired result.
SELECT
DISTINCT
ProductMaster.ProductMasterID, ProductMaster.ProductName,
ProductMaster.SubCategoryID, SubCategory.SubCategoryName,
ProductBrandAndType.ProductBranAndTypeID, ProductBrandAndType.ProductType,
ProductBrandAndType.Brand, ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductBrandAndType
RIGHT OUTER JOIN
Inward
ON ProductBrandAndType.ProductBranAndTypeID = Inward.ProductBrandAndTypeID
RIGHT OUTER JOIN
ProductMaster
ON Inward.ProductID = ProductMaster.ProductMasterID
LEFT OUTER JOIN
SubCategory
ON ProductMaster.SubCategoryID = SubCategory.SubCategoryID
ORDER BY
ProductMaster.ProductName,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand;
Can anyone help me on this?
Regards
Sikandar
Following query worked.
SELECT dbo.ProductMaster.ProductMasterID, dbo.ProductMaster.ProductName, dbo.ProductMaster.SubCategoryID, dbo.ProductMaster.ProductDesc,
dbo.ProductMaster.ReOrderLevel, null as ProductBrandAndTypeID,
null AS Type, null as Brand
FROM dbo.ProductBrandAndType RIGHT OUTER JOIN
dbo.Inward ON dbo.ProductBrandAndType.ProductBranAndTypeID = dbo.Inward.ProductBrandAndTypeID RIGHT OUTER JOIN
dbo.ProductMaster ON dbo.Inward.ProductID = dbo.ProductMaster.ProductMasterID LEFT OUTER JOIN
dbo.SubCategory ON dbo.ProductMaster.SubCategoryID = dbo.SubCategory.SubCategoryID
UNION
SELECT dbo.ProductMaster.ProductMasterID, dbo.ProductMaster.ProductName, dbo.ProductMaster.SubCategoryID, dbo.ProductMaster.ProductDesc,
dbo.ProductMaster.ReOrderLevel, dbo.ProductBrandAndType.ProductBranAndTypeID,
dbo.ProductBrandAndType.ProductType, dbo.ProductBrandAndType.Brand
FROM dbo.ProductBrandAndType RIGHT OUTER JOIN
dbo.Inward ON dbo.ProductBrandAndType.ProductBranAndTypeID = dbo.Inward.ProductBrandAndTypeID RIGHT OUTER JOIN
dbo.ProductMaster ON dbo.Inward.ProductID = dbo.ProductMaster.ProductMasterID LEFT OUTER JOIN
dbo.SubCategory ON dbo.ProductMaster.SubCategoryID = dbo.SubCategory.SubCategoryID
ORDER BY ProductName;
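The UNION idea can be sketched on a much smaller schema: one branch returns every master row with the detail columns set to NULL, the other returns the master/detail pairs. This is run through Python's sqlite3 as a stand-in for MS-SQL. The Detail table is a hypothetical simplification of the Inward and ProductBrandAndType joins, and the second branch uses an inner join rather than the outer joins above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified master/detail tables.
conn.executescript("""
CREATE TABLE ProductMaster (ProductMasterID INTEGER, ProductName TEXT);
CREATE TABLE Detail (ProductID INTEGER, Brand TEXT);
INSERT INTO ProductMaster VALUES (1, 'Pen'), (2, 'Ink');
INSERT INTO Detail VALUES (1, 'Acme'), (1, 'Bolt');
""")

# Branch 1: one "blank" row per master row. Branch 2: the detail rows.
rows = conn.execute("""
SELECT m.ProductMasterID, m.ProductName, NULL AS Brand
FROM ProductMaster m
UNION
SELECT m.ProductMasterID, m.ProductName, d.Brand
FROM ProductMaster m
JOIN Detail d ON d.ProductID = m.ProductMasterID
ORDER BY ProductName, Brand
""").fetchall()

for row in rows:
    print(row)  # the NULL-brand row sorts ahead of its detail rows
```

Both SQLite and SQL Server sort NULLs first in ascending order, so the blank master row lands ahead of its detail rows.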
If you just want the first row from ProductMaster, then you can do something like this:
;WITH CTE AS
(
SELECT
ROW_NUMBER() OVER(ORDER BY ProductMaster.ProductMasterID) AS RowNbr,
ProductMaster.ProductMasterID,
ProductMaster.ProductName,
ProductMaster.SubCategoryID,
ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductMaster
)
SELECT
ProductMaster.ProductMasterID,
ProductMaster.ProductName,
ProductMaster.SubCategoryID,
SubCategory.SubCategoryName,
ProductBrandAndType.ProductBranAndTypeID,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand,
ProductMaster.ProductDesc,
ProductMaster.ReOrderLevel
FROM
ProductBrandAndType
RIGHT OUTER JOIN Inward
ON ProductBrandAndType.ProductBranAndTypeID = Inward.ProductBrandAndTypeID
RIGHT OUTER JOIN CTE AS ProductMaster
ON Inward.ProductID = ProductMaster.ProductMasterID
LEFT OUTER JOIN SubCategory
ON ProductMaster.SubCategoryID = SubCategory.SubCategoryID
WHERE ProductMaster.RowNbr = 1
ORDER BY
ProductMaster.ProductName,
ProductBrandAndType.ProductType,
ProductBrandAndType.Brand;