LEFT OUTER JOIN not always matching - sql

I'm starting with a SQL query with a couple of joins and I'm getting the exact data I expect. This is what the current query is.
SELECT DISTINCT o.OrganizationHierarchyUnitLevelFourCd, o.OrganizationHierarchyUnitLevelThreeNm, o.OrganizationHierarchyUnitLevelFourNm
FROM Lab_Space l
JOIN Worker w ON l.Contact_WWID = w.WWID AND w.Employee_Status_Code = 'A'
JOIN Org_Hierarchy o ON o.OrganizationHierarchyUnitLevelThreeNm IS NOT NULL AND w.Org_Hierarchy_Unit_Cd = o.OrganizationHierarchyUnitCd
ORDER BY o.OrganizationHierarchyUnitLevelThreeNm, o.OrganizationHierarchyUnitLevelFourNm
This ends up with a row like
1234 | Finance | IT
Now I've created a new table, where I'm tracking whether or not I want to include the organization in my output. That table just has two columns, an org ID and a bit field. Since the second table won't have data on all orgs, I thought I could use a LEFT OUTER JOIN, and expanded the query to this:
SELECT DISTINCT o.OrganizationHierarchyUnitLevelFourCd, o.OrganizationHierarchyUnitLevelThreeNm, o.OrganizationHierarchyUnitLevelFourNm, v.Include
FROM Lab_Space l
JOIN Worker w ON l.Contact_WWID = w.WWID AND w.Employee_Status_Code = 'A'
JOIN Org_Hierarchy o ON o.OrganizationHierarchyUnitLevelThreeNm IS NOT NULL AND w.Org_Hierarchy_Unit_Cd = o.OrganizationHierarchyUnitCd
LEFT OUTER JOIN Validation_Email_Org_Unit_Inclusion v ON o.OrganizationHierarchyUnitCd = v.OrganizationHierarchyUnitCd
ORDER BY o.OrganizationHierarchyUnitLevelThreeNm, o.OrganizationHierarchyUnitLevelFourNm
The problem I have is now I end up with rows like so:
1234 | Finance | IT | NULL
1234 | Finance | IT | 1
Since the Validation_Email_Org_Unit_Inclusion table includes a 1 for the 1234 org, I would expect to just get a single row with a value of 1, not include the row with NULL.
What have I done wrong?

You output OrganizationHierarchyUnitLevelFourCd but currently join on OrganizationHierarchyUnitCd. Join on the same column you output to get the corresponding value.
SELECT DISTINCT o.OrganizationHierarchyUnitLevelFourCd, ...
...
LEFT OUTER JOIN Validation_Email_Org_Unit_Inclusion v ON o.OrganizationHierarchyUnitLevelFourCd = v.OrganizationHierarchyUnitCd
...
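A minimal sketch of why the extra NULL row shows up, run against in-memory SQLite from Python. The two-column tables and the unit codes 1234 and 5678 are hypothetical stand-ins, assuming several units roll up to the same level-four org:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Org_Hierarchy (
    OrganizationHierarchyUnitCd INTEGER,
    OrganizationHierarchyUnitLevelFourCd INTEGER
);
CREATE TABLE Validation_Email_Org_Unit_Inclusion (
    OrganizationHierarchyUnitCd INTEGER,
    Include INTEGER
);
-- Hypothetical data: units 1234 and 5678 both roll up to level-four org 1234.
INSERT INTO Org_Hierarchy VALUES (1234, 1234), (5678, 1234);
INSERT INTO Validation_Email_Org_Unit_Inclusion VALUES (1234, 1);
""")

# Joining on the unit code: unit 5678 has no inclusion row, so it yields NULL.
by_unit = conn.execute("""
SELECT DISTINCT o.OrganizationHierarchyUnitLevelFourCd, v.Include
FROM Org_Hierarchy o
LEFT OUTER JOIN Validation_Email_Org_Unit_Inclusion v
  ON o.OrganizationHierarchyUnitCd = v.OrganizationHierarchyUnitCd
""").fetchall()

# Joining on the level-four code that is actually output: one row per org.
by_level_four = conn.execute("""
SELECT DISTINCT o.OrganizationHierarchyUnitLevelFourCd, v.Include
FROM Org_Hierarchy o
LEFT OUTER JOIN Validation_Email_Org_Unit_Inclusion v
  ON o.OrganizationHierarchyUnitLevelFourCd = v.OrganizationHierarchyUnitCd
""").fetchall()
```

With this data, by_unit holds both (1234, NULL) and (1234, 1), while by_level_four holds only (1234, 1).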

How to find difference between table with multiple conditions

I have two tables with the same structure but some differing values. I would like to find those differences, with the condition that the value column counts as different only if it differs by more than 10.
For example, if all 9 key columns have the same values in both tables but the values column differs by 11, the record is different. If the difference is 9, the records are the same.
So I wrote this query to get differences:
select *
from test.test m
inner join test.test1 t
on
m.month_date = t.month_date and
m.level_1 = t.level_1 and
m.level_2 = t.level_2 and
m.level_3 = t.level_3 and
m.level_4 = t.level_4 and
m.level_header = t.level_header and
m.unit = t.unit and
m.model_type_id = t.model_type_id and
m.model_version_desc = t.model_version_desc
where m.month_date = '2022-11-01' and abs(m.value - t.value) > 10
This returns all records where every key column matches but the value difference exceeds the threshold.
Second, I have a full outer join to get the rows that exist in only one of the tables:
select *
from test.test m
full outer join test.test1 t
on
m.month_date = t.month_date and
m.level_1 = t.level_1 and
m.level_2 = t.level_2 and
m.level_3 = t.level_3 and
m.level_4 = t.level_4 and
m.level_header = t.level_header and
m.unit = t.unit and
m.model_type_id = t.model_type_id and
m.model_version_desc = t.model_version_desc
where m.month_date is null or t.month_date is null and m.month_date = '2022-11-01'
How can I combine the results of these two queries without UNION? I want to have only one query (sub query is acceptable)
Assuming that for a given day, you need to find
rows that match between the tables but exceed the value difference threshold
AND
rows present in either left or right table, that don't have a corresponding row in the other table
select *
from test.test m
full outer join test.test1 t
using (
month_date,
level_1,
level_2,
level_3,
level_4,
level_header,
unit,
model_type_id,
model_version_desc )
where (m.month_date is null
or t.month_date is null
and m.month_date = '2022-11-01' )
or (m.month_date = '2022-11-01' and abs(m.value - t.value) > 10);
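One thing worth checking in the WHERE clause above: AND binds tighter than OR, so a or b and c is read as a or (b and c). A minimal sketch using in-memory SQLite from Python (an assumption for illustration; the 1/0 literals are abstract truth values, not data from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# a = 1, b = 0, c = 0: without parentheses this is a OR (b AND c).
as_written = conn.execute("SELECT 1 OR 0 AND 0").fetchone()[0]

# Explicit grouping of the OR first gives a different answer.
grouped = conn.execute("SELECT (1 OR 0) AND 0").fetchone()[0]
```

Read that way, the m.month_date is null branch is not restricted by the date, so rows that exist only in test.test1 would come back for every month. If that is not wanted, the date test for those rows probably needs to be on t.month_date.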
Since the columns used to join the tables have the same names, you can shorten the join condition by swapping the lengthy table1.column1=table2.column1 and... list of pairs for a single USING (month_date,level_1,level_2,level_3,...). As a bonus, select * then lists each matching column once instead of twice (once for the left table, once for the right table).
select *
from (select 1,2,3) as t1(a,b,c)
full outer join
(select 1,2,3) as t2(a,b,c)
on t1.a=t2.a
and t1.b=t2.b
and t1.c=t2.c;
-- a | b | c | a | b | c
-----+---+---+---+---+---
-- 1 | 2 | 3 | 1 | 2 | 3
select *
from (select 1,2,3) as t1(a,b,c)
full outer join
(select 1,2,3) as t2(a,b,c)
using(a,b,c);
-- a | b | c
-----+---+---
-- 1 | 2 | 3
In your first query, you can replace the null values with a specific number. Something like this:
where m.month_date = '2022-11-01' and abs(ISNULL(m.value,-99) - ISNULL(t.value,-99)) > 10
The above replaces the nulls with -99 (choose an appropriate value for your data), so if m.value is 10 and t.value is null, the row should be returned by your first query. Note that ISNULL with two arguments is SQL Server syntax; COALESCE is the standard equivalent.
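To see why the substitution matters: arithmetic on NULL yields NULL, and a NULL comparison never passes a WHERE filter. A small sketch with in-memory SQLite from Python, using COALESCE as a stand-in for ISNULL (SQLite is only an assumption for the demo, and 10 / -99 are the illustrative values from above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# m.value = 10, t.value = NULL: the comparison itself becomes NULL.
plain = conn.execute("SELECT abs(10 - NULL) > 10").fetchone()[0]

# With the placeholder -99 substituted, the difference is 109 and the test passes.
filled = conn.execute(
    "SELECT abs(COALESCE(10, -99) - COALESCE(NULL, -99)) > 10"
).fetchone()[0]
```

Here plain comes back as NULL (None in Python), so the row would be dropped, while filled is 1.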

What is outer apply in SQL?

select *
from Kosten_Test a
left join T_Pflege_Parameter pv on pv.Parameterbezeichnung = 'Dummy' and pv.Parametereigenschaft = 'Dummy3'
left join T_Pflege_Parameter pp on pp.Parameterbezeichnung = 'Dummy2' and pp.Parametereigenschaft = a.Herkunft
outer apply( select max(VarianteID) as VarianteID, MerkmalTyp, MerkmalWert
from Test2
where MerkmalTyp = pv.Parameterwert and MerkmalWert = sku
group by MerkmalWert, MerkmalTyp
union
select max(VarianteID) as VarianteID, MerkmalTyp, MerkmalWert
from Testvariante_ASIN_SKU
where MerkmalTyp = pv.Parameterwert and MerkmalWert = a.asin
group by MerkmalWert, MerkmalTyp
) vm
left join (select distinct bundleid from T_Archiv_BundleKomponente) bk on bk.BundleID = vm.VarianteID
Because of the OUTER APPLY statement I always get duplicate results. Can anyone help?
OUTER APPLY is not quite like a LEFT JOIN with an ON condition. The applied query is evaluated once for each row of the data it is applied to, and every row it returns is attached to that row.
If your query without the OUTER APPLY returns 5 rows and the applied query returns 5 rows for each of them, then the resulting dataset would contain 25 records, with each record from the OUTER APPLY joined to each row of the other data.
Oftentimes the data is condensed back down by aggregating the values returned from the applied query, grouped inside of the main query.
Example -
Q1
----
A
B
C
OuterApplyQuery
-----------------
1
2
3
SELECT * FROM Q1 OUTER APPLY (SELECT * FROM OuterApplyQuery) AS X
Result
---------
A 1
A 2
A 3
B 1
B 2
B 3
C 1
C 2
C 3
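The example above has no correlation between the two sides, which is why every combination appears. As a rough model of the semantics (plain Python, not SQL Server itself; outer_apply, right_for and the codes mapping are hypothetical names for illustration):

```python
def outer_apply(left_rows, right_for):
    """Model of OUTER APPLY: evaluate right_for(row) once per left row.

    Left rows for which the applied query returns nothing are kept,
    padded with None (the NULL of this sketch).
    """
    result = []
    for left in left_rows:
        matches = list(right_for(left))
        if not matches:
            result.append((left, None))
        for right in matches:
            result.append((left, right))
    return result


q1 = ["A", "B", "C"]
numbers = [1, 2, 3]

# Uncorrelated: the applied query ignores the left row, so 3 x 3 = 9 rows.
uncorrelated = outer_apply(q1, lambda row: numbers)

# Correlated: the applied query depends on the left row (hypothetical lookup).
codes = {"A": [1, 2], "B": []}
correlated = outer_apply(q1, lambda row: codes.get(row, []))
```

In the correlated case A appears twice, once per match, and B and C each appear once with None. Duplicates in the original query would, under this model, come from the applied UNION returning more than one row for the same outer row.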

Returning a number when result set is null

Each lot object contains a corresponding list of work orders. These work orders have tasks assigned to them which are structured by the task set on the lot's parent (the phase). I am trying to get the LOT_ID back and a count of TASK_ID where the TASK_ID is found to exist for the where condition.
The problem is if the TASK_ID is not found, the result set is null and the LOT_ID is not returned at all.
I have uploaded a single row for LOT, PHASE, and WORK_ORDER to the following SQLFiddle. I would have added more data but there is a fun limiter .. err I mean character limiter to the editor.
SQLFiddle
SELECT W.[LOT_ID], COUNT(*) AS NUMBER_TASKS_FOUND
FROM [PHASE] P
JOIN [LOT] L ON L.[PHASE_ID] = P.[PHASE_ID]
JOIN [WORK_ORDER] W ON W.[LOT_ID] = L.[LOT_ID]
WHERE P.[TASK_SET_ID] = 1 AND W.[TASK_ID] = 41
GROUP BY W.[LOT_ID]
The query returns the expected result when the task id is found (46) but no result when the task id is not found (say 41). I'd expect in that case to see something like:
+--------+--------------------+
| LOT_ID | NUMBER_TASKS_FOUND |
+--------+--------------------+
| 500 | 0 |
| 506 | 0 |
+--------+--------------------+
I have a feeling this needs to be wrapped in a sub-query and then joined but I am uncertain what the syntax would be here.
My true objective is to be able to pass a list of TASK_ID and get back any LOT_ID that doesn't match, but for now I am just doing a query per task until I can figure that out.
You want to see all lots with their counts for the task. So either outer join the tasks or cross apply their count or use a subquery in the select clause.
select l.lot_id, count(wo.work_order_id) as number_tasks_found
from lot l
left join work_order wo on wo.lot_id = l.lot_id and wo.task_id = 41
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
group by l.lot_id
order by l.lot_id;
or
select l.lot_id, w.number_tasks_found
from lot l
cross apply
(
select count(*) as number_tasks_found
from work_order wo
where wo.lot_id = l.lot_id
and wo.task_id = 41
) w
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
order by l.lot_id;
or
select l.lot_id,
(
select count(*)
from work_order wo
where wo.lot_id = l.lot_id
and wo.task_id = 41
) as number_tasks_found
from lot l
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
order by l.lot_id;
Another option would be to outer join the count and use COALESCE to turn null into zero in your result.
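That last option might look like the sketch below, run here against in-memory SQLite from Python. The table layouts and the sample rows are assumptions based on the column names used in the queries above, not the asker's real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phase (phase_id INTEGER, task_set_id INTEGER);
CREATE TABLE lot (lot_id INTEGER, phase_id INTEGER);
CREATE TABLE work_order (work_order_id INTEGER, lot_id INTEGER, task_id INTEGER);
-- Hypothetical sample rows: only task 46 exists.
INSERT INTO phase VALUES (1, 1);
INSERT INTO lot VALUES (500, 1), (506, 1);
INSERT INTO work_order VALUES (1, 500, 46), (2, 500, 46), (3, 506, 46);
""")

# Aggregate first, outer join the counts, then turn a missing count into 0.
QUERY = """
SELECT l.lot_id, COALESCE(w.number_tasks_found, 0) AS number_tasks_found
FROM lot l
LEFT JOIN (
    SELECT lot_id, COUNT(*) AS number_tasks_found
    FROM work_order
    WHERE task_id = ?
    GROUP BY lot_id
) w ON w.lot_id = l.lot_id
WHERE l.phase_id IN (SELECT p.phase_id FROM phase p WHERE p.task_set_id = 1)
ORDER BY l.lot_id
"""

missing = conn.execute(QUERY, (41,)).fetchall()
found = conn.execute(QUERY, (46,)).fetchall()
```

With this data, task 41 gives a zero count for both lots instead of an empty result, and task 46 gives 2 for lot 500 and 1 for lot 506.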

Is there any way to show the records using same query?

I want to list the notices that have not been sent, so I tried the query below, but it shows the wrong result. Is there any way to show the unsent notices using the following query?
SELECT
vtn.*,
vn.id as notice_id,
vn.vnotice_datetime as sent_notice_time
FROM
vtemplates vt
LEFT JOIN vtemplate_notices vtn ON( vtn.vtemplate_id = vt.id)
LEFT JOIN vnotices vn ON(vn.vtemplate_notice_id = vtn.id AND vn.vnotice_datetime IS NULL)
LEFT JOIN violations v ON ( v.vtemplate_id = vt.id)
WHERE
v.id = 1
Records in a violation_notices table are as follows:
--------------------------------------------------------------
id vtemplate_notice_id desc vnotice_datetime created_on
---------------------------------------------------------------
1 1 test1 22/12/2018 05:30 22/12/2018
Expected Result:
id vtemplate_id created_on notice_id sent_notice_time
---------------------------------------------------------------
2 1 23/12/2018 NULL NULL
3 1 24/12/2018 NULL NULL
4 1 24/12/2018 NULL NULL
Actual Result:
id vtemplate_id created_on notice_id sent_notice_time
---------------------------------------------------------------
1 1 22/12/2018 NULL NULL
2 1 23/12/2018 NULL NULL
3 1 24/12/2018 NULL NULL
4 1 24/12/2018 NULL NULL
The actual result includes the first record (which should not appear): its vnotice_datetime is NOT NULL, but it is still shown.
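Well, left joins don't remove non-matching rows. A condition in the ON clause only decides whether the right-hand columns are filled in or left NULL; the left-hand row is kept either way. Shifting the IS NULL check from the ON to the WHERE clause might work.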
SELECT vtn.*,
vn.id notice_id,
vn.vnotice_datetime sent_notice_time
FROM vtemplates vt
LEFT JOIN vtemplate_notices vtn
ON vtn.vtemplate_id = vt.id
LEFT JOIN vnotices vn
ON vn.vtemplate_notice_id = vtn.id
LEFT JOIN violations v
ON v.vtemplate_id = vt.id
WHERE v.id = 1
AND vn.vnotice_datetime IS NULL;
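A cut-down sketch of the difference, using in-memory SQLite from Python. Only the two tables that matter are modelled, and their columns and rows are assumptions reconstructed from the question's sample output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vtemplate_notices (id INTEGER, vtemplate_id INTEGER);
CREATE TABLE vnotices (id INTEGER, vtemplate_notice_id INTEGER, vnotice_datetime TEXT);
INSERT INTO vtemplate_notices VALUES (1, 1), (2, 1), (3, 1), (4, 1);
-- Only notice 1 has been sent.
INSERT INTO vnotices VALUES (1, 1, '2018-12-22 05:30');
""")

# IS NULL in the ON clause: the sent row simply fails to join, notice 1 stays.
in_on = conn.execute("""
SELECT vtn.id, vn.id
FROM vtemplate_notices vtn
LEFT JOIN vnotices vn
  ON vn.vtemplate_notice_id = vtn.id AND vn.vnotice_datetime IS NULL
ORDER BY vtn.id
""").fetchall()

# IS NULL in the WHERE clause: notice 1 joins to its sent row and is filtered out.
in_where = conn.execute("""
SELECT vtn.id, vn.id
FROM vtemplate_notices vtn
LEFT JOIN vnotices vn
  ON vn.vtemplate_notice_id = vtn.id
WHERE vn.vnotice_datetime IS NULL
ORDER BY vtn.id
""").fetchall()
```

in_on lists all four notices with NULL notice ids, matching the "Actual Result" in the question; in_where lists only notices 2, 3 and 4.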
You can use NOT EXISTS, or test that the joined column IS NULL in the WHERE clause.
https://dev.mysql.com/doc/refman/8.0/en/exists-and-not-exists-subqueries.html
For example:
SELECT * FROM violations
WHERE NOT EXISTS(
SELECT * FROM notifications
WHERE violation_id = violations.id
);
SELECT v.*, n.* FROM violations v
LEFT JOIN notifications n
ON n.violation_id = v.id
WHERE n.violation_id IS NULL

Is there a simpler way to write this query? [MS SQL Server]

I'm wondering if there is a simpler way to accomplish my goal than what I've come up with.
I am returning a specific attribute that applies to an object. The objects go through multiple iterations and the attributes might change slightly from iteration to iteration. The iteration will only be added to the table if the attribute changes. So the most recent iteration might not be in the table.
Each attribute is uniquely identified by a combination of the Attribute ID (AttribId) and Generation ID (GenId).
Object_Table
ObjectId | AttribId | GenId
32 | 2 | 3
33 | 3 | 1
Attribute_Table
AttribId | GenId | AttribDesc
1 | 1 | Text
2 | 1 | Some Text
2 | 2 | Some Different Text
3 | 1 | Other Text
When I query on a specific object I would like it to return an exact match if possible. For example, Object ID 33 would return "Other Text".
But if there is no exact match, I would like for the most recent generation (largest Gen ID) to be returned. For example, Object ID 32 would return "Some Different Text". Since there is no Attribute ID 2 from Gen 3, it uses the description from the most recent iteration of the Attribute which is Gen ID 2.
This is what I've come up with to accomplish that goal:
SELECT attr.AttribDesc
FROM Attribute_Table AS attr
JOIN Object_Table AS obj
ON attr.AttribId = obj.AttribId
WHERE attr.GenId = (SELECT MIN(GenId)
FROM(SELECT CASE obj2.GenId
WHEN attr2.GenId THEN attr2.GenId
ELSE(SELECT MAX(attr3.GenId)
FROM Attribute_Table AS attr3
JOIN Object_Table AS obj3
ON obj3.AttribId = attr3.AttribId
WHERE obj3.AttribId = 2
)
END AS GenId
FROM Attribute_Table AS attr2
JOIN Object_Table AS obj2
ON attr2.AttribId = obj2.AttribId
WHERE obj2.AttribId = 2
) AS ListOfGens
)
Is there a simpler way to accomplish this? I feel that there should be, but I'm relatively new to SQL and can't think of anything else.
Thanks!
The following query will return the matching value, if found, otherwise use a correlated subquery to return the value with the highest GenId and matching AttribId:
SELECT obj.ObjectId,
CASE WHEN attr1.AttribDesc IS NOT NULL THEN attr1.AttribDesc ELSE attr2.AttribDesc END AS AttribDesc
FROM Object_Table AS obj
LEFT JOIN Attribute_Table AS attr1
ON attr1.AttribId = obj.AttribId AND attr1.GenId = obj.GenId
LEFT JOIN Attribute_Table AS attr2
ON attr2.AttribId = obj.AttribId AND attr2.GenId = (
SELECT max(GenId)
FROM Attribute_Table AS attr3
WHERE attr3.AttribId = obj.AttribId)
In the case where there is no matching record at all with the given AttribId, it will return NULL. If you want to get no record at all in this case, make the second JOIN an INNER JOIN rather than a LEFT JOIN.
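As a quick check of that approach, here is a sketch that loads the sample rows from the question into in-memory SQLite from Python (SQLite rather than SQL Server is an assumption for the demo) and uses COALESCE in place of the CASE expression:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Object_Table (ObjectId INTEGER, AttribId INTEGER, GenId INTEGER);
CREATE TABLE Attribute_Table (AttribId INTEGER, GenId INTEGER, AttribDesc TEXT);
INSERT INTO Object_Table VALUES (32, 2, 3), (33, 3, 1);
INSERT INTO Attribute_Table VALUES
    (1, 1, 'Text'),
    (2, 1, 'Some Text'),
    (2, 2, 'Some Different Text'),
    (3, 1, 'Other Text');
""")

# attr1 is the exact (AttribId, GenId) match; attr2 is the newest generation.
rows = conn.execute("""
SELECT obj.ObjectId,
       COALESCE(attr1.AttribDesc, attr2.AttribDesc) AS AttribDesc
FROM Object_Table AS obj
LEFT JOIN Attribute_Table AS attr1
  ON attr1.AttribId = obj.AttribId AND attr1.GenId = obj.GenId
LEFT JOIN Attribute_Table AS attr2
  ON attr2.AttribId = obj.AttribId AND attr2.GenId = (
      SELECT MAX(attr3.GenId)
      FROM Attribute_Table AS attr3
      WHERE attr3.AttribId = obj.AttribId)
ORDER BY obj.ObjectId
""").fetchall()
```

Object 32 has no generation 3 row, so it falls back to "Some Different Text"; object 33 matches exactly and returns "Other Text".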
Try this...
In case the logic doesn't find a match for the Object_Table GenId, it maps it to the highest GenId for that attribute in the ON clause of the JOIN.
SELECT AttribDesc
FROM Object_Table A
INNER JOIN Attribute_Table B
ON A.AttribId = B.AttribId
AND (
CASE
WHEN A.GenId <> B.GenId
THEN (
SELECT MAX(C.GenId)
FROM Attribute_Table C
WHERE A.AttribId = C.AttribId
)
ELSE A.GenId
END
) -- Selecting the right GenId in the join clause should do the job
= B.GenId
This should work:
with x as (
select *, row_number() over (partition by AttribId order by GenId desc) as rn
from Attribute_Table
)
select isnull(a.attribdesc, x.attribdesc)
from Object_Table o
left join Attribute_Table a
on o.AttribId = a.AttribId and o.GenId = a.GenId
left join x on o.AttribId = x.AttribId and rn = 1