LEFT OUTER JOIN with IS NULL - sql

I have a LEFT OUTER JOIN chain which only works for 90% of the cases.
The Tables
days:
id, date_value
1, 2017-01-01
2, 2017-01-02
3, 2017-01-03
periods:
id, starts_on, ends_on, name, federal_state_id, country_id
1, 2017-01-01, 2017-01-01, Test1, 1, NULL
2, 2017-01-03, 2017-01-03, Test2, 2, NULL
slots:
id, day_id, period_id
1, 1, 1
2, 3, 2
The SQL-Query
SELECT days.date_value, periods.name, periods.federal_state_id
FROM days
LEFT OUTER JOIN slots ON (days.id = slots.day_id)
LEFT OUTER JOIN periods ON (slots.period_id = periods.id)
WHERE days.date_value >= '2017-01-01' AND
days.date_value <='2017-01-03' AND
(periods.id IS NULL OR periods.country_id = 1 OR
periods.federal_state_id = 1)
ORDER BY days.date_value;
The Result
2017-01-01, Test1, 1
2017-01-02, NULL, NULL
But I'd like to get this result:
2017-01-01, Test1, 1
2017-01-02, NULL, NULL
2017-01-03, NULL, NULL
As far as I understand it, the periods.id IS NULL doesn't match for the last entry because there is period #2 with a federal_state_id of 2.
How do I have to change the SQL-Query to get the result I want?

If you move the condition on the periods entries into the join condition, the periods that fail it will not be part of the set that is joined with the slots, which produces the desired NULL for name and federal_state_id. So the set that will be joined will only contain:
periods:
id, starts_on, ends_on, name, federal_state_id, country_id
1, 2017-01-01, 2017-01-01, Test1, 1, NULL
As such, every entry of days can be returned (subject to the condition on the date range). The query would then look like this:
SELECT
days.date_value,
periods.name,
periods.federal_state_id
FROM days
LEFT OUTER JOIN slots
ON (days.id = slots.day_id)
LEFT OUTER JOIN periods
ON (slots.period_id = periods.id) AND
((periods.country_id = 1 OR
periods.federal_state_id = 1))
WHERE
days.date_value >= '2017-01-01' AND
days.date_value <='2017-01-03'
ORDER BY
days.date_value;
Please notice that the condition periods.id IS NULL is not needed any more: the join conditions are applied only to slots and periods, where every periods entry has a value for id because it is the primary key. And as slots and periods are joined by a left join, slots having no match in the periods table are not removed.
In your original query you first joined all the entries of periods (if the id matched) and then removed all the entries that didn't have the desired country_id or federal_state_id. This removed the dates that did have a match in periods but did not meet the condition.
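To see the difference between the two placements side by side, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the column types, and the variable names where_rows and on_rows are assumptions made for illustration; the slots columns are named day_id and period_id as in the queries above, and the date-range filter is left out because all three sample days fall inside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE days (id INTEGER PRIMARY KEY, date_value TEXT);
CREATE TABLE periods (id INTEGER PRIMARY KEY, name TEXT,
                      federal_state_id INTEGER, country_id INTEGER);
CREATE TABLE slots (id INTEGER PRIMARY KEY, day_id INTEGER, period_id INTEGER);
INSERT INTO days VALUES (1, '2017-01-01'), (2, '2017-01-02'), (3, '2017-01-03');
INSERT INTO periods VALUES (1, 'Test1', 1, NULL), (2, 'Test2', 2, NULL);
INSERT INTO slots VALUES (1, 1, 1), (2, 3, 2);
""")

# Filter in WHERE: day 3 joins to period 2 first, then the whole row is discarded.
where_rows = conn.execute("""
SELECT days.date_value, periods.name, periods.federal_state_id
FROM days
LEFT OUTER JOIN slots ON days.id = slots.day_id
LEFT OUTER JOIN periods ON slots.period_id = periods.id
WHERE periods.id IS NULL
   OR periods.country_id = 1
   OR periods.federal_state_id = 1
ORDER BY days.date_value
""").fetchall()

# Filter in ON: period 2 never joins, so day 3 survives with NULLs.
on_rows = conn.execute("""
SELECT days.date_value, periods.name, periods.federal_state_id
FROM days
LEFT OUTER JOIN slots ON days.id = slots.day_id
LEFT OUTER JOIN periods
       ON slots.period_id = periods.id
      AND (periods.country_id = 1 OR periods.federal_state_id = 1)
ORDER BY days.date_value
""").fetchall()

print(where_rows)  # two rows, 2017-01-03 is missing
print(on_rows)     # three rows, 2017-01-03 has NULL name and federal_state_id
```

With this data the first query returns the two rows from the question and the second returns all three days.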

I always recommend formatting SQL statements. Errors are easier to find.
Move the conditions to the joins like:
SELECT days.date_value
,periods.name
,periods.federal_state_id
FROM dbo.days
LEFT OUTER JOIN dbo.slots
ON days.id = slots.day_id
LEFT OUTER JOIN dbo.periods
ON slots.period_id = periods.id
AND (
periods.country_id = 1
OR periods.federal_state_id = 1
)
WHERE days.date_value >= '2017-01-01'
AND days.date_value <='2017-01-03'
ORDER BY days.date_value
;

Related

How to Grab Specific Row info?

Below is an example of the output when you run this query:
select A.DispatchNote, A.MStockCode, A.NComment
from MdnMaster A
MdnMaster.DispatchNote
MdnMaster.MStockCode
MdnMaster.NComment
12345/001
CAL2-01234-010-50L
12345/001
FREIGHT
12345/001
1 Parcel
12345/001
Trk# 1Z8R9V80013141323 - 5 lb
12345/001
Trk#: 1Z8R9V900381868191 -- 18 lb
12345/001
SHP 21401
12345/002
CAL3-0121-020-50L
12345/002
FREIGHT
12345/002
2 Parcels
12345/002
Trk# 1Z8R9V80013141323 - 5 lb
12345/002
Trk#: 1Z8R9V900381868191 -- 18 lb
12345/002
SHP 2140
I'm trying to write a query that'll grab just the first tracking number in the list and ignore the second (or sometimes third) one they have.
The database has blank NComment lines when there's an MStockCode, and then the MStockCode lines are blank for every NComment line so I don't know what I'm doing.
What I have so far:
SELECT
m.DispatchNote,
MAX(d.MStockCode) as StockCode,
MAX(case when d.NComment like 'Trk%' then d.NComment end) as NComment,
MAX(m.CustomerPoNumber) as CustomerPO
FROM MdnMaster AS m
LEFT OUTER JOIN MdnDetail AS d on m.DispatchNote = d.DispatchNote
AND (d.NComment LIKE 'Trk%' OR d.MStockCode is not null)
and m.Customer = 'LAWSON'
and d.MLineShipDate =
case
when datepart(weekday, getdate() -1) = '7'
then DATEADD(hh,0,dateadd(DAY, datediff(day, 0, getdate()),-2)) -- if yesterday was Saturday, set to Friday
when datepart(weekday, getdate() -1) = '1'
then DATEADD(hh,0,dateadd(DAY, datediff(day, 0, getdate()),-3)) -- if yesterday was Sunday, set to Friday
else DATEADD(hh,0,dateadd(DAY, datediff(day, 0, getdate()),-1))
end
GROUP BY m.DispatchNote
My issue is that it gives me nothing since I only know how to ask it explicitly that I want the lines that aren't blank. How do I fix it?
EDIT: I should mention that all of the information comes from the MdnMaster Table (which is A) and MLineShipDate will come from B (MdnDetail). I omitted that information because I didn't think it was pertinent to the question at hand.
An example of what I want to see from the above:
MdnMaster.DispatchNote
MdnMaster.MStockCode
MdnMaster.NComment
12345/001
CAL2-01234-010-50L
Trk# 1Z8R9V80013141323 - 5 lb
Here's a quick way to get some results. Hopefully, it will set you on the right path.
I'm assuming you can specify a second column to determine the order of the comments. Replace all instances of Line below with the actual column name.
Select
m1.DispatchNote,
m3.MStockCode,
m1.NComment
From
MdnMaster m1
Inner Join (
Select DispatchNote, Min(Line) as Line
From MdnMaster
Where NComment like 'Trk%'
Group by DispatchNote ) m2
on m1.DispatchNote = m2.DispatchNote and m1.Line = m2.Line
Inner Join (
Select DispatchNote, Max(MStockCode) as MStockCode
From MdnMaster
Group by DispatchNote ) m3
on m1.DispatchNote = m3.DispatchNote
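As a rough check of this idea, the sketch below runs the same query shape against a cut-down copy of the sample data in an in-memory SQLite database (via Python's sqlite3). The Line column is the hypothetical ordering column mentioned above, and its values, the use of empty strings for the blank fields, and the added ORDER BY are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MdnMaster (DispatchNote TEXT, Line INTEGER,
                        MStockCode TEXT, NComment TEXT);
-- Line is a hypothetical column that records the order of the rows.
INSERT INTO MdnMaster VALUES
  ('12345/001', 1, 'CAL2-01234-010-50L', ''),
  ('12345/001', 2, '', '1 Parcel'),
  ('12345/001', 3, '', 'Trk# 1Z8R9V80013141323 - 5 lb'),
  ('12345/001', 4, '', 'Trk#: 1Z8R9V900381868191 -- 18 lb'),
  ('12345/002', 1, 'CAL3-0121-020-50L', ''),
  ('12345/002', 2, '', '2 Parcels'),
  ('12345/002', 3, '', 'Trk# 1Z8R9V80013141323 - 5 lb'),
  ('12345/002', 4, '', 'Trk#: 1Z8R9V900381868191 -- 18 lb');
""")

first_tracking = conn.execute("""
SELECT m1.DispatchNote, m3.MStockCode, m1.NComment
FROM MdnMaster m1
INNER JOIN (
    -- lowest Line among the tracking comments of each dispatch note
    SELECT DispatchNote, MIN(Line) AS Line
    FROM MdnMaster
    WHERE NComment LIKE 'Trk%'
    GROUP BY DispatchNote
) m2 ON m1.DispatchNote = m2.DispatchNote AND m1.Line = m2.Line
INNER JOIN (
    -- the non-blank stock code sorts above the blank ones
    SELECT DispatchNote, MAX(MStockCode) AS MStockCode
    FROM MdnMaster
    GROUP BY DispatchNote
) m3 ON m1.DispatchNote = m3.DispatchNote
ORDER BY m1.DispatchNote
""").fetchall()

for row in first_tracking:
    print(row)
```

Each dispatch note comes back once, with its stock code and only the first tracking comment.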
One approach is to use a cross apply together with select top 1 to retrieve the tracking number.
select M.DispatchNote, M.MStockCode, TRK.NComment
from MdnMaster M
cross apply (
select top 1 M2.NComment
from MdnMaster M2
where M2.DispatchNote = M.DispatchNote
and M2.NComment LIKE 'Trk# %'
-- order by ?
) TRK
where M.MStockCode <> ''
Another approach is to join to a subselect that selects all tracking numbers and assigns row numbers within each group. The final select would limit itself to those tracking numbers where row number = 1.
select M.DispatchNote, M.MStockCode, TRK.NComment
from MdnMaster M
join (
select M2.DispatchNote, M2.NComment,
row_number() OVER(PARTITION BY M2.DispatchNote order by (select null)) as RN
from MdnMaster M2
where M2.NComment LIKE 'Trk# %'
) TRK ON TRK.DispatchNote = M.DispatchNote
where M.MStockCode <> ''
and TRK.RN = 1
See this db<>fiddle for examples of both.
If there is a chance that there is no tracking number, but you still want to include the other results, change cross apply to outer apply in the first query, or the join to a left join in the second. A cross apply is like an inner join to a subselect, while an outer apply is like a left join.
If you have criteria that prefers one tracking number over another, include it in the order by clause of the subselect in the first query, or replace the order by (select null) placeholder clause in the second. Otherwise, an arbitrary tracking number will be selected.

Creating Daily In-Use table w/ Zeros When NULL

Hello Stack Community,
I am not sure if I titled this accurately, but I am attempting to create a table that tracks the daily in-use quantity by product code. Currently my code drops dates where a product isn't in-use whereas I need that to show as a 0.
My thought was that by using the date from the date table, my LEFT OUTER JOIN with the ISNULL on the field would produce a 0, but nay.
Here is my code, with a screenshot of what it outputs with the red square highlighting where it's missing date records that I need to show as 0 :
SELECT
DD.DATE,
DE.PRODUCT_CODE,
--OOC = OUT OF CONTEXT, EITHER ISN'T CHARGEABLE OR ISN'T CURRENTLY ACTIVE
ISNULL(SUM(LIDV.QTY - LIDV.QTYSUB),0),
OD.LOCATION,
OD.SOURCE
FROM Dim_Date AS DD
LEFT OUTER JOIN ORDERv_DatesDays AS OD ON DD.DATE BETWEEN OD.SHIP_DATE AND OD.adjRETURN_DATE
LEFT OUTER JOIN FACT_Orders_LIDs AS LIDV ON LIDV.SORDERID_DAX = OD.SORDERID_DAX
LEFT OUTER JOIN DIM_ECODES AS DE ON DE.PRODUCT_CODE = LIDV.eCODE
WHERE
--DD.DATE = '3/1/2017' AND
DD.DATE BETWEEN '1/1/2017' AND EOMONTH( DATEADD( MONTH , -1, CURRENT_TIMESTAMP ) ) AND
DE.PRODUCT_CODE = '07316-' AND
YEAR(DD.DATE) = 2017
GROUP BY
DD.DATE,
DE.PRODUCT_CODE,
OD.LOCATION,
OD.SOURCE
ORDER BY
DD.DATE
I also thought, since I'm no SQL expert, that perhaps I need to just create a table with each product code and date for a specified date range but I got tripped up trying to create that as well.
Thank you for any assistance, if I need to add more info just let me know what I'm missing.
This WHERE predicate is killing your left join:
DE.PRODUCT_CODE = '07316-' AND
If product_code 07316 was not "out on loan" (or whatever) between Feb 24 and April 6 then all those rows would have looked like:
DATE PRODUCT_CODE INUSE LOCATION
2017-02-25 NULL NULL NULL
2017-02-26 NULL NULL NULL
2017-02-27 NULL NULL NULL
2017-02-28 NULL NULL NULL
...
2017-04-05 NULL NULL NULL
But that NULL in product_code means that when the WHERE clause asks "is NULL equal to 07316- ?" the answer is not true (a comparison with NULL yields unknown), so the row disappears from the result set.
Consider
LEFT OUTER JOIN DIM_ECODES AS DE
ON
DE.PRODUCT_CODE = LIDV.eCODE AND
DE.PRODUCT_CODE = '07316-'
You might also want to make some changes in the SELECT block too:
'07316-' as PRODUCT_CODE,
COALESCE(INUSE,0) AS INUSE
It might make more sense to you to write it like this:
FROM
Dim_Date AS DD
LEFT OUTER JOIN
(
SELECT
OD.SHIP_DATE,
OD.adjRETURN_DATE,
LIDV.QTY,
LIDV.QTYSUB,
OD.LOCATION,
OD.SOURCE
FROM
ORDERv_DatesDays AS OD
INNER JOIN FACT_Orders_LIDs AS LIDV ON LIDV.SORDERID_DAX = OD.SORDERID_DAX
INNER JOIN DIM_ECODES AS DE ON DE.PRODUCT_CODE = LIDV.eCODE
WHERE
DE.PRODUCT_CODE = '07316-'
) x
ON DD.DATE BETWEEN x.SHIP_DATE AND x.adjRETURN_DATE
WHERE
This is "list of dates on the left" and "any relevant data, already joined together and where'd on the right"
It should also be noted that if you're doing this for multiple product codes, to prevent just a single date row if both product 07316 and 07317 are in use on the 28th Feb you'd need to:
FROM
(
SELECT DISTINCT DD.DATE, DE.PRODUCT_CODE
FROM Dim_Date AS DD CROSS JOIN DIM_ECODES DE
WHERE ..date range clause..
)
This takes your list of dates, and crosses it with your list of prod codes, so you can be certain there are at least these two rows:
2017-02-28 07316-
2017-02-28 07317-
Then when you left join the products on date and product code, both those rows' data survive the left join, and become associated with nulls:
2017-02-28 07316- NULL NULL
2017-02-28 07317- NULL NULL
Without doing that CROSS, you'd have just one row (null in product code)
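Here is a small sketch of that date-by-product grid, run in an in-memory SQLite database through Python's sqlite3. The table names dim_date, dim_ecodes and in_use_orders, their columns, and the sample rows are all simplified stand-ins invented for illustration, not the real schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for the real tables.
CREATE TABLE dim_date (date TEXT);
CREATE TABLE dim_ecodes (product_code TEXT);
CREATE TABLE in_use_orders (product_code TEXT, ship_date TEXT,
                            return_date TEXT, qty INTEGER);
INSERT INTO dim_date VALUES ('2017-02-27'), ('2017-02-28'), ('2017-03-01');
INSERT INTO dim_ecodes VALUES ('07316-'), ('07317-');
INSERT INTO in_use_orders VALUES ('07316-', '2017-02-27', '2017-02-27', 3);
""")

grid_rows = conn.execute("""
SELECT g.date, g.product_code, COALESCE(SUM(u.qty), 0) AS in_use
FROM (
    -- every date paired with every product code
    SELECT d.date, e.product_code
    FROM dim_date d CROSS JOIN dim_ecodes e
) g
LEFT JOIN in_use_orders u
       ON u.product_code = g.product_code
      AND g.date BETWEEN u.ship_date AND u.return_date
GROUP BY g.date, g.product_code
ORDER BY g.date, g.product_code
""").fetchall()

for row in grid_rows:
    print(row)
```

Every date appears once per product code, and the combinations with no matching order show 0 instead of dropping out.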

How to evaluate multiple records against a single record to find the absence of matching values?

I'm creating some QA/QC queries to clean up my data and I am trying to find the absence of matching values between two tables. In total, I have three tables: Inspections (INSP), Risk Assessments (RA) and Work Orders (WO). Inspections generate Risk Assessments (INSP.GlobalID = RA.InspectionGlobalID) and are 1 to many. Risk Assessments generate Work Orders and are many to 1 (RA.WorkOrderGlobalID = WO.GlobalID). Inspections and Work Orders are 1 to 1 (INSP.GlobalID = WO.InspectionGlobalID).
Risk Assessments and Work Orders both have a "Priority" field, a smallint that ranges from 0-12 and indicates how important the work order is and when it gets done (12 being the most critical). The issue I'm having is that there can be multiple RA records associated with a single work order. I'm trying to find instances where none of the Risk Assessment priorities matches the Work Order priority.
For example, there can be 3 RAs with priorities 6, 8 and 10. The Work Order could have priority 8 (which is acceptable), and in this case I wouldn't want to select any of these RAs because there is a matching priority in the group, but my query is selecting the priority 6 and 10 RAs from the group. How do I evaluate all RAs associated with an INSP and select records where there is no matching priority at all between the two tables (RA/WO)?
SELECT
RA.Priority,
WO.Priority ,
RA.InspectionGlobalID as RA_INSP_GLBID,
WO.InspectionGlobalID,
RA.WorkOrderGlobalID as RAWOGLBID,
WO.GlobalID,
WO.OBJECTID as WOID,
INSP.GlobalID,
INSP.OBJECTID as INID,
RA.OBJECTID RA_OBJECTID
FROM
CFAdmin.RISKASSESSMENT_EVW as RA INNER JOIN
CFAdmin.WORKORDER_EVW AS WO ON WO.GlobalID = RA.WorkOrderGlobalID LEFT OUTER JOIN
CFAdmin.INSPECTION_EVW as INSP ON INSP.GlobalID = RA.InspectionGlobalID LEFT OUTER JOIN
CFAdmin.PLANTINGSPACE_EVW as PS ON INSP.PlantingSpaceGlobalID = PS.GlobalID
WHERE
RA.Priority <> WO.Priority AND
INSP.InspectionDate IS NOT NULL AND
(INSP.CreatedDate > '7/1/2018') AND
WO.CancelDate IS NULL AND
WO.Status <>2 AND
(WO.CreatedDate > '7/1/2018') AND
WO.Type NOT IN (17,18, 44,45,3) AND
WO.WOEntity = 0
Boiling your SQL down to just the part that relates to this question:
(You're not even using anything from PLANTINGSPACE_EVW)
SELECT RA.Priority
, WO.Priority
, RA.WorkOrderGlobalID as RAWOGLBID
, WO.GlobalID
FROM CFAdmin.RISKASSESSMENT_EVW as RA
INNER JOIN CFAdmin.WORKORDER_EVW AS WO ON WO.GlobalID = RA.WorkOrderGlobalID
AND WO.Priority <> RA.Priority
WHERE WO.CancelDate IS NULL
AND WO.Status <> 2
AND WO.CreatedDate > cast('7/1/2018' as date)
AND WO.Type NOT IN (17, 18, 44, 45, 3)
AND WO.WOEntity = 0
You should get all of the records where the priorities don't match.
I can't test without sample data, but how about something like this?
SELECT RA.Priority
, WO.Priority
, RA.InspectionGlobalID as RA_INSP_GLBID
, WO.InspectionGlobalID
, RA.WorkOrderGlobalID as RAWOGLBID
, WO.GlobalID
, WO.OBJECTID as WOID
, RA.OBJECTID RA_OBJECTID
FROM CFAdmin.RISKASSESSMENT_EVW as RA
INNER JOIN CFAdmin.WORKORDER_EVW AS WO ON WO.GlobalID = RA.WorkOrderGlobalID
WHERE WO.GlobalId not in (
SELECT distinct WO2.GlobalID
FROM CFAdmin.RISKASSESSMENT_EVW as RA2
INNER JOIN CFAdmin.WORKORDER_EVW AS WO2 ON WO2.GlobalID = RA2.WorkOrderGlobalID
AND WO2.Priority = RA2.Priority
)
AND WO.CancelDate IS NULL
AND WO.Status <> 2
AND (WO.CreatedDate > '7/1/2018')
AND WO.Type NOT IN (17, 18, 44, 45, 3)
AND WO.WOEntity = 0
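To check the NOT IN idea against the example from the question (RAs with priorities 6, 8 and 10 on a work order with priority 8), here is a reduced sketch in an in-memory SQLite database via Python's sqlite3. The two tables, their column names, and the second work order are hypothetical simplifications, and the other filters (dates, status, type) are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, reduced versions of the WO and RA views.
CREATE TABLE work_order (id INTEGER PRIMARY KEY, priority INTEGER);
CREATE TABLE risk_assessment (id INTEGER PRIMARY KEY,
                              work_order_id INTEGER, priority INTEGER);
INSERT INTO work_order VALUES (1, 8), (2, 5);
INSERT INTO risk_assessment VALUES
  (1, 1, 6), (2, 1, 8), (3, 1, 10),  -- one RA matches work order 1
  (4, 2, 3), (5, 2, 4);              -- no RA matches work order 2
""")

# Row-by-row comparison: also returns the 6 and 10 RAs of work order 1.
naive_rows = conn.execute("""
SELECT ra.id, ra.priority, wo.id, wo.priority
FROM risk_assessment ra
JOIN work_order wo ON wo.id = ra.work_order_id
WHERE ra.priority <> wo.priority
ORDER BY ra.id
""").fetchall()

# Group-level check: skip every work order that has at least one matching RA.
no_match_rows = conn.execute("""
SELECT ra.id, ra.priority, wo.id, wo.priority
FROM risk_assessment ra
JOIN work_order wo ON wo.id = ra.work_order_id
WHERE wo.id NOT IN (
    SELECT wo2.id
    FROM risk_assessment ra2
    JOIN work_order wo2 ON wo2.id = ra2.work_order_id
                       AND wo2.priority = ra2.priority
)
ORDER BY ra.id
""").fetchall()

print(naive_rows)
print(no_match_rows)
```

Only the RAs of the second work order remain in no_match_rows, because none of them shares its priority.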

JOIN Pulling 2 times rows than expected

I've 3 tables - Table tblFactorDefinition, tblFamily and tblConstituent.
tblFactorDefinition has FamilyID with the corresponding Factors in the FieldName column (namely Factor1, Factor2, ..., Factor9).
tblConstituent has the associated Factor values (values for Factor1, Factor2, ..., Factor9 if they exist) for each constituent within the Family and can be joined by FamilyID with FamilyID in tblFactorDefinition.
tblFamily has the Family details (i.e. FamilyTypeID=1 is an Index and FamilyTypeID=2 is an ETF).
While trying to retrieve FamilyID with its Factors and the corresponding Factor values in tblConstituent, I get 2-3 times the rows. For example, FamilyID = 10216 has 27975 constituents but my query fetches more than 55k rows. I am up against the wall trying to figure out the JOIN.
SELECT DISTINCT tc.FamilyID,
tfd.FieldName,
tc.Factor1,
tc.Factor2,
tc.Factor3,
tc.Factor4,
tc.Factor5,
tc.Factor6,
tc.Factor7,
tc.Factor8,
tc.Factor9,
tf.OpenDate
FROM soladbserver..tblFamily tf
JOIN soladbserver..tblFactorDefinition tfd
ON tfd.FamilyID = tf.FamilyID
JOIN soladbserver..tblConstituent tc
ON tc.FamilyID = tf.FamilyID
AND tc.StartDate <= Getdate()
AND tc.EndDate > Getdate()
WHERE tf.OpenDate = Cast(Getdate() AS DATE)
AND tf.FamilyTypeID = 1
AND tf.DataProviderID = 2
AND tf.FamilyID IN ( 10216 )
I am expecting 27975 rows with factor values for the corresponding FieldName (Factor1, Factor2, ..., Factor9), given all have values.
Screenshot 1 is tblConstituent table,
Screenshot 2 is tblFactorDefinition table,
Screenshot 3,4,5 is tblFamily table:
Change the join to "Left Outer Join", and use a SQL subquery select statement to pull the FieldName and see what you get. If FamilyID is a primary key in the tc table and a foreign key in the others, this should get you where you want to be.
SELECT tf.FamilyID,
(Select top 1 isNull(tfd.FieldName,'') from soladbserver..tblFactorDefinition tfd
where tfd.FamilyID = tf.FamilyID ) as FieldName, -- this assumes each familyID only has one tfd.FieldName -- if not change both to left outer joins and leave the rest as is and run it
tc.Factor1,
tc.Factor2,
tc.Factor3,
tc.Factor4,
tc.Factor5,
tc.Factor6,
tc.Factor7,
tc.Factor8,
tc.Factor9,
tf.OpenDate
FROM soladbserver..tblFamily tf
left outer JOIN soladbserver..tblConstituent tc
ON tc.FamilyID = tf.FamilyID
AND tc.StartDate <= Getdate()
AND tc.EndDate > Getdate()
WHERE tf.OpenDate = Cast(Getdate() AS DATE)
AND tf.FamilyTypeID = 1
AND tf.DataProviderID = 2
AND tf.FamilyID IN ( 10216 )

How to force postgres to return 0 even if there are no rows matching query, using coalesce, group by and join

I've been trying hopelessly to get the following SQL statement to return the query results and default to 0 if there are no rows matching the query.
This is the intended result:
vol | year
-------+------
0 | 2018
Instead I get:
vol | year
-----+------
(0 rows)
Here is the sql statement:
select coalesce(vol,0) as vol, year
from (select sum(vol) as vol, year
from schema.fact_data
join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
join schema.product_data
on schema.fact_data.product_tag =
schema.product_data.tag
join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where "retailer"='MadeUpRetailer'
and "product_tag"='FakeProductTag'
and "year"='2018' group by year
) as DerivedTable;
I know the query works because it returns data when there is data. Just doesn't default to 0 as intended...
Any help in finding why this is the case would be much appreciated!
Using your subquery DerivedTable, you could write:
SELECT coalesce(DerivedTable.vol, 0) AS vol,
y.year
FROM (VALUES ('2018'::text)) AS y(year)
LEFT JOIN (SELECT ...) AS DerivedTable
ON DerivedTable.year = y.year;
Remove the GROUP BY (and the outer query):
select 2018 as year, coalesce(sum(vol), 0) as vol
from schema.fact_data f join
schema.period_data p
on f.period_tag = p.tag join
schema.product_data pr
on f.product_tag = pr.tag join
schema.market_data m
on f.market_tag = m.tag
where "retailer" = 'MadeUpRetailer' and
"product_tag" = 'FakeProductTag' and
"year" = '2018';
An aggregation query with no GROUP BY always returns exactly one row, so this should do what you want.
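A quick way to convince yourself of this is the sketch below, which uses an in-memory SQLite database through Python's sqlite3 instead of Postgres. The single-table fact_data layout with a year column is an assumption made for illustration, since it is not clear from the question which table holds year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout; there is no data for 2018.
conn.execute("CREATE TABLE fact_data (year TEXT, vol INTEGER)")
conn.execute("INSERT INTO fact_data VALUES ('2017', 5)")

# With GROUP BY: no matching rows means no groups, so nothing comes back.
grouped = conn.execute("""
SELECT COALESCE(SUM(vol), 0) AS vol, year
FROM fact_data
WHERE year = '2018'
GROUP BY year
""").fetchall()

# Without GROUP BY: the aggregate still yields exactly one row.
ungrouped = conn.execute("""
SELECT '2018' AS year, COALESCE(SUM(vol), 0) AS vol
FROM fact_data
WHERE year = '2018'
""").fetchall()

print(grouped)    # []
print(ungrouped)  # [('2018', 0)]
```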
EDIT:
The query would look something like this:
select v.yyyy as year, coalesce(sum(vol), 0) as vol
from (values (2018), (2019)) v(yyyy) left join
schema.fact_data f
on f.year = v.yyyy left join -- this is just an example. I have no idea where year is coming from
schema.period_data p
on f.period_tag = p.tag left join
schema.product_data pr
on f.product_tag = pr.tag left join
schema.market_data m
on f.market_tag = m.tag
group by v.yyyy
However, you have to move the where conditions to the appropriate on clauses. I have no idea where the columns are coming from.
From the code you posted it is not clear in which table you have the year column.
You can use UNION to fetch just 1 row in case there are no rows in that table for the year 2018 like this:
select sum(vol) as vol, year
from schema.fact_data inner join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
inner join schema.product_data
on schema.fact_data.product_tag = schema.product_data.tag
inner join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where
"retailer"='MadeUpRetailer' and
"product_tag"='FakeProductTag' and
"year"='2018'
group by "year"
union
select 0 as vol, '2018' as year
where not exists (
select 1 from tablename where "year" = '2018'
)
In case there are rows for the year 2018, then nothing will be fetched by the 2nd query.