Left Join On And clause not supported - sql

I've looked into various posts (this one, that one and this other one) and thought I got the answer.
After a LEFT JOIN I may add an ON [condition] AND [other condition] (I've also tried WHERE). But computer says no. Access keeps saying the join expression is not supported.
Consider the student_records table below:
STUDENTCODE | SEMESTERINDEX
12345 | 20112
12345 | 20113
12345 | 20121
67890 | 0
67890 | 20111
67890 | 20112
I want to find the minimum SEMESTERINDEX for each student from my students table, that's above 20001. (Records below may be erroneous and the 0 and 1 SEMESTERINDEX is used for transferred credits.)
I'm using Access, so there are VBA functions inside the SQL. There are several more tables I'm joining too, so I'm quoting the whole query.
SELECT students.STUDENTCODE, prefixes.PREFIXNAMEENG,
students.STUDENTNAMEENG, students.STUDENTSURNAMEENG, levels.level_name, programs.PROGRAMNAMEENG, calendars.calendar_load,
MAX(student_records.SEMESTERINDEX) AS latest_semester, MIN(student_records.SEMESTERINDEX) AS intake_semester
FROM student_records LEFT JOIN (
(
(
(
(students LEFT JOIN prefixes ON students.PREFIXID = prefixes.PREFIXID)
LEFT JOIN levels ON students.LEVELID = levels.level_id)
LEFT JOIN programs ON students.PROGRAMID = programs.PROGRAMID)
LEFT JOIN calendar_conversion ON students.SCHEDULEGROUPID = calendar_conversion.schedule_id)
LEFT JOIN calendars ON calendar_conversion.calendar_id = calendars.calendar_id) ON student_records.STUDENTCODE = students.STUDENTCODE AND student_records.SEMESTERINDEX> 2001
GROUP BY students.STUDENTCODE, prefixes.PREFIXNAMEENG, students.STUDENTNAMEENG, students.STUDENTSURNAMEENG, levels.level_name, programs.PROGRAMNAMEENG, calendars.calendar_load;
So did I misplace the AND student_records.SEMESTERINDEX > 2001?

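Oh my, save me from these parentheses and crazy indenting.
Here is how you do it. Most SQL dialects don't need all those parentheses (Access itself does require them when chaining joins, so you may need to add them back there).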
SELECT
students.STUDENTCODE,
prefixes.PREFIXNAMEENG,
students.STUDENTNAMEENG,
students.STUDENTSURNAMEENG,
levels.level_name,
programs.PROGRAMNAMEENG,
calendars.calendar_load,
minmax.latest_semester,
minmax.intake_semester
FROM students
LEFT JOIN (
SELECT
STUDENTCODE,
MAX(SEMESTERINDEX) AS latest_semester,
MIN(SEMESTERINDEX) AS intake_semester
FROM student_records
WHERE SEMESTERINDEX > 2001
GROUP BY STUDENTCODE
) AS minmax ON students.STUDENTCODE = minmax.STUDENTCODE
LEFT JOIN prefixes ON students.PREFIXID = prefixes.PREFIXID
LEFT JOIN levels ON students.LEVELID = levels.level_id
LEFT JOIN programs ON students.PROGRAMID = programs.PROGRAMID
LEFT JOIN calendar_conversion ON students.SCHEDULEGROUPID = calendar_conversion.schedule_id
LEFT JOIN calendars ON calendar_conversion.calendar_id = calendars.calendar_id
This is called a sub-query in SQL. It allows you to perform your grouping on a subset and then join that back to the rest of the data.
I think you went wrong thinking there was something about the join that needed a filter -- in fact it is the data that you were joining to that needed to be filtered.
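To make the "filter the data, not the join" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The table contents are the sample rows from the question; student 11111 (who has no records at all) and the cut-down students table are assumptions added only to show that the LEFT JOIN still keeps such a student.

```python
import sqlite3

# In-memory SQLite database standing in for the Access tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (STUDENTCODE INTEGER PRIMARY KEY);
CREATE TABLE student_records (STUDENTCODE INTEGER, SEMESTERINDEX INTEGER);
INSERT INTO students VALUES (12345), (67890), (11111);  -- 11111 is hypothetical
INSERT INTO student_records VALUES
  (12345, 20112), (12345, 20113), (12345, 20121),
  (67890, 0), (67890, 20111), (67890, 20112);
""")

# Filter and aggregate inside the derived table, then join it back to students.
rows = conn.execute("""
SELECT s.STUDENTCODE, mm.intake_semester, mm.latest_semester
FROM students AS s
LEFT JOIN (
    SELECT STUDENTCODE,
           MIN(SEMESTERINDEX) AS intake_semester,
           MAX(SEMESTERINDEX) AS latest_semester
    FROM student_records
    WHERE SEMESTERINDEX > 2001
    GROUP BY STUDENTCODE
) AS mm ON mm.STUDENTCODE = s.STUDENTCODE
ORDER BY s.STUDENTCODE
""").fetchall()

for row in rows:
    print(row)
```

The 0 record for student 67890 is dropped before MIN is computed, and the student without any records still appears with NULLs.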

Related

Why do I have multiple entries per entity in the query output?

I would like to know why my query displays multiple entries per entity in the output.
From what I understand, there is only one active policy per entity.
I created the query in SQL Server Management Studio (SSMS). My output needs parameters to display correctly, and this is what I have tried so far.
Currently my SQL SSMS query output displays the following:
Entity_Number | Building_Name | PolicyID | Description | Start_Date | End_Date
400 | Xpress | 4 | 5 Day Grace | 7/1/2019 | 9/27/2019
400 | Xpress | 18 | 2 Day Grace | 7/3/2018 | 7/13/2018
400 | Xpress | 19 | 4 Day Grace | 2/27/2019 | 2/27/2019
What I really would like to know is how do I drill down and find out why my query returns multiples?
[Query]
SELECT
e.Entity_Number,
bld.Building_Name,
cbp.PolicyId,
cbp.Description,
cbp.StartDate,
cbp.EndDate
FROM
dbo.buildings AS bld
INNER JOIN dbo.entities AS e
ON bld.Entity_ID = e.Entity_ID
INNER JOIN Collections.Building AS cbp
ON bld.Building_ID = cbp.BuildingId
INNER JOIN Collections.BuildingProfile AS cbpro
ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId
WHERE
bld.Building_Active = 1
AND e.Active = 1
Use the "salami technique" to isolate where the unexpected rows come from. What I mean by this is that you cut down the query like a salami by omitting each join (and any column references related to that join) one by one.
e.g. start with masking the join to Collections.BuildingProfile:
SELECT
e.Entity_Number
, bld.Building_Name
, cbp.PolicyId
, cbp.Description
, cbp.StartDate
, cbp.EndDate
FROM dbo.buildings AS bld
INNER JOIN dbo.entities AS e ON bld.Entity_ID = e.Entity_ID
INNER JOIN Collections.Building AS cbp ON bld.Building_ID = cbp.BuildingId
-- INNER JOIN Collections.BuildingProfile AS cbpro ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId
WHERE bld.Building_Active = 1
AND e.Active = 1
Does this remove the unexpected rows? If not, then try:
SELECT
e.Entity_Number
, bld.Building_Name
--, cbp.PolicyId
--, cbp.Description
--, cbp.StartDate
--, cbp.EndDate
FROM dbo.buildings AS bld
INNER JOIN dbo.entities AS e ON bld.Entity_ID = e.Entity_ID
--INNER JOIN Collections.Building AS cbp ON bld.Building_ID = cbp.BuildingId
--INNER JOIN Collections.BuildingProfile AS cbpro ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId
WHERE bld.Building_Active = 1
AND e.Active = 1
Eventually by masking out each join (and any related column references to that table) you will discover which table is producing the unexpected multiplication of rows.
Once that table is identified, I suggest you reconsider all the assumptions you have made about how that table has been joined. For example, you state that "From what I understand there is only one active policy per entity." Is that really true?
Once you know where the problem starts, and you reconsider how that data should actually be used within the query, you should be closer to a solution. e.g. perhaps you need more conditions in the join, or you need to join a subquery instead of directly to the table.
Notes:
Collections.BuildingProfile does not seem to be needed by the query, so why not omit it anyway?
Reformatting the SELECT clause as "comma first" makes the "salami technique" easier to apply.
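The slicing can also be done by counting rows as each join is added back. The sketch below uses Python's sqlite3 module with a simplified, hypothetical schema (entities, buildings, policies, lower-case column names) loosely modelled on the question; it is only meant to show the idea, not the asker's real tables.

```python
import sqlite3

# Hypothetical, cut-down tables; one entity, one building, three policies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (entity_id INTEGER, entity_number INTEGER);
CREATE TABLE buildings (building_id INTEGER, entity_id INTEGER, building_name TEXT);
CREATE TABLE policies (policy_id INTEGER, building_id INTEGER, description TEXT);
INSERT INTO entities VALUES (1, 400);
INSERT INTO buildings VALUES (10, 1, 'Xpress');
INSERT INTO policies VALUES
  (4, 10, '5 Day Grace'), (18, 10, '2 Day Grace'), (19, 10, '4 Day Grace');
""")

def row_count(sql):
    """Return how many rows the given SELECT statement produces."""
    return conn.execute("SELECT COUNT(*) FROM (" + sql + ")").fetchone()[0]

# Slice 1: only the first join.
base = """
SELECT e.entity_number
FROM buildings AS bld
INNER JOIN entities AS e ON bld.entity_id = e.entity_id
"""
# Slice 2: add the next join back and compare the counts.
with_policies = base + "INNER JOIN policies AS p ON p.building_id = bld.building_id"

counts = {"base": row_count(base), "with_policies": row_count(with_policies)}
print(counts)
```

The join at which the count jumps is the one multiplying rows; here it is the policies join, because the building has three policy rows.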

How to include column values as null even when condition is not met?

Write a query to show ALL building names, their metering company name and meter type for all buildings that do not have postpaid meters.
Image 1 is the result that I should get, and image 2 is the result that I am getting:
USE Ultimate_DataBase
GO
SELECT [Bld_Name], [Elec_company_name], [Mtype_Name]
FROM [dbo].[Metering_Company] A
FULL OUTER JOIN [dbo].[Metering_Type] D
ON A.[MType_ID]= D.MType_ID
FULL OUTER JOIN [dbo].[Building_metering] B
ON A.[Elec_ID]= B.[Elec_ID]
FULL OUTER JOIN [dbo].[Building] C
ON C.[Bld_ID]= B.[Bld_ID]
WHERE [Mtype_Name] != 'POSTPAID'
Try moving the WHERE logic to the corresponding ON clause:
SELECT [Bld_Name], [Elec_company_name], [Mtype_Name]
FROM [dbo].[Metering_Company] A
FULL OUTER JOIN [dbo].[Metering_Type] D
ON A.[MType_ID]= D.MType_ID AND
[Mtype_Name] != 'POSTPAID' -- change is here
FULL OUTER JOIN [dbo].[Building_metering] B
ON A.[Elec_ID]= B.[Elec_ID]
FULL OUTER JOIN [dbo].[Building] C
ON C.[Bld_ID]= B.[Bld_ID];
Note: Please add aliases to your select clause. They are not mandatory, assuming no two tables ever have columns by the same name, but just having aliases would have made your question easier to answer.
FULL JOIN doesn't seem necessary -- in fact FULL JOIN is almost never needed, and especially not for routine JOINs in a well-structured database.
The structure of the question suggests NOT EXISTS:
SELECT b.*
FROM dbo.Building b
WHERE NOT EXISTS (SELECT 1
FROM dbo.Building_metering bm JOIN
dbo.Metering_Company mc
ON bm.Elec_ID = mc.Elec_ID JOIN
dbo.Metering_Type mt
ON mt.MType_ID = mc.MType_ID
WHERE bm.Bld_ID = b.Bld_ID AND mt.Mtype_Name = 'POSTPAID'
);
You can also express this as a LEFT JOIN and filtering:
SELECT b.*
FROM dbo.Building b LEFT JOIN
dbo.Building_metering bm
ON bm.Bld_ID = b.Bld_ID LEFT JOIN
dbo.Metering_Company mc
ON bm.Elec_ID = mc.Elec_ID LEFT JOIN
dbo.Metering_Type mt
ON mt.MType_ID = mc.MType_ID AND
mt.Mtype_Name = 'POSTPAID'
WHERE mt.MType_ID IS NULL;
This allows you to select columns from any of the tables.
Notes:
FULL JOIN is almost never needed.
Use meaningful table aliases! Arbitrary letters mean nothing. Use table abbreviations.
Escaping column and table names with square brackets just makes code harder to write and to read.
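As a rough check that the NOT EXISTS form and the LEFT JOIN / IS NULL form agree, here is a sketch using Python's sqlite3 module. The table and column names follow the question, but the three buildings and two metering companies are made-up sample data, and each building has at most one meter; with several meters per building the two forms can differ.

```python
import sqlite3

# Made-up sample data: Alpha is prepaid, Beta is postpaid, Gamma has no meter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Building (Bld_ID INTEGER, Bld_Name TEXT);
CREATE TABLE Metering_Type (MType_ID INTEGER, Mtype_Name TEXT);
CREATE TABLE Metering_Company (Elec_ID INTEGER, Elec_company_name TEXT, MType_ID INTEGER);
CREATE TABLE Building_metering (Bld_ID INTEGER, Elec_ID INTEGER);
INSERT INTO Building VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO Metering_Type VALUES (1, 'PREPAID'), (2, 'POSTPAID');
INSERT INTO Metering_Company VALUES (10, 'PrepayCo', 1), (20, 'PostpayCo', 2);
INSERT INTO Building_metering VALUES (1, 10), (2, 20);
""")

# Form 1: keep buildings for which no POSTPAID meter exists.
not_exists_rows = conn.execute("""
SELECT b.Bld_Name
FROM Building AS b
WHERE NOT EXISTS (
    SELECT 1
    FROM Building_metering AS bm
    JOIN Metering_Company AS mc ON bm.Elec_ID = mc.Elec_ID
    JOIN Metering_Type AS mt ON mt.MType_ID = mc.MType_ID
    WHERE bm.Bld_ID = b.Bld_ID AND mt.Mtype_Name = 'POSTPAID'
)
ORDER BY b.Bld_Name
""").fetchall()

# Form 2: try to match a POSTPAID type in the ON clause, keep the non-matches.
left_join_rows = conn.execute("""
SELECT b.Bld_Name
FROM Building AS b
LEFT JOIN Building_metering AS bm ON bm.Bld_ID = b.Bld_ID
LEFT JOIN Metering_Company AS mc ON bm.Elec_ID = mc.Elec_ID
LEFT JOIN Metering_Type AS mt
       ON mt.MType_ID = mc.MType_ID AND mt.Mtype_Name = 'POSTPAID'
WHERE mt.MType_ID IS NULL
ORDER BY b.Bld_Name
""").fetchall()

print(not_exists_rows, left_join_rows)
```

Both queries keep the building with no meter at all, which is the part the original WHERE on a FULL OUTER JOIN was losing.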
USE Ultimate_DataBase
GO
SELECT [Bld_Name], [Elec_company_name], [Mtype_Name]
FROM [dbo].[Metering_Company] A
LEFT JOIN [dbo].[Metering_Type] D
ON A.[MType_ID]= D.MType_ID
LEFT JOIN [dbo].[Building_metering] B
ON A.[Elec_ID]= B.[Elec_ID]
LEFT JOIN [dbo].[Building] C
ON C.[Bld_ID]= B.[Bld_ID]
Use this

Doing a join in SQL Server

I'm trying to achieve a join where the SELECT statement has multiple columns that reference the same name in a particular table, for example:
SELECT
sh.shift_number,
sh.workplace_num,
wk.workplace_name,
sh.workplace_num2,
sh.workplace_num3
FROM shifts AS sh
INNER JOIN workplace AS wk
ON sh.workplace_num = wk.workplace_num
My problem is that I can only get the name of the first workplace. How do I get the same for workplace 2 and workplace 3?
Shift_number | workplace_Num | workplace_name | workplace_Num2 | workplace_Num3
4 | 2 | Teller | 3 | 4
As you can see, workplace_name (Teller) displays the name for workplace_num (2). I'd like to also show the names for workplace_num2 and workplace_num3; they all take the workplace name from the joined workplace table!
I'm restricted from uploading a picture, hopefully the illustration paints a picture!!
You have to join to the workplace table multiple times. Note that since you did not specify whether the fields workplace_num2 and workplace_num3 are nullable, I assumed they are and used LEFT JOIN. You should use INNER JOIN if they are not nullable:
SELECT
sh.shift_number,
sh.workplace_num,
wp.workplace_name,
sh.workplace_num2,
wp2.workplace_name as workplace_name2,
sh.workplace_num3,
wp3.workplace_name as workplace_name3
FROM shifts AS sh
INNER JOIN workplace AS wp
ON sh.workplace_num = wp.workplace_num
LEFT JOIN workplace AS wp2
ON sh.workplace_num2 = wp2.workplace_num
LEFT JOIN workplace AS wp3
ON sh.workplace_num3 = wp3.workplace_num
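Here is a small sketch of why the LEFT JOIN versus INNER JOIN choice matters, using Python's sqlite3 module. Shift 4 and the name Teller come from the question; the other workplace names and shift 5 (with a NULL workplace_num3) are invented for illustration.

```python
import sqlite3

# Sample data: shift 4 is from the question, shift 5 and two names are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE workplace (workplace_num INTEGER, workplace_name TEXT);
CREATE TABLE shifts (shift_number INTEGER, workplace_num INTEGER,
                     workplace_num2 INTEGER, workplace_num3 INTEGER);
INSERT INTO workplace VALUES (2, 'Teller'), (3, 'Vault'), (4, 'Front Desk');
INSERT INTO shifts VALUES (4, 2, 3, 4), (5, 2, 3, NULL);
""")

# The same table is joined three times, each time under its own alias.
query = """
SELECT sh.shift_number,
       wp.workplace_name  AS workplace_name,
       wp2.workplace_name AS workplace_name2,
       wp3.workplace_name AS workplace_name3
FROM shifts AS sh
INNER JOIN workplace AS wp ON sh.workplace_num = wp.workplace_num
{join} JOIN workplace AS wp2 ON sh.workplace_num2 = wp2.workplace_num
{join} JOIN workplace AS wp3 ON sh.workplace_num3 = wp3.workplace_num
ORDER BY sh.shift_number
"""

left_rows = conn.execute(query.format(join="LEFT")).fetchall()
inner_rows = conn.execute(query.format(join="INNER")).fetchall()

print(left_rows)   # shift 5 is kept, with NULL for the third name
print(inner_rows)  # shift 5 disappears because workplace_num3 is NULL
```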
Try this:
SELECT
sh.shift_number,
sh.workplace_num,
wk.workplace_name,
wk2.workplace_name,
wk3.workplace_name
FROM shifts AS sh
INNER JOIN workplace AS wk
ON sh.workplace_num = wk.workplace_num
INNER JOIN workplace AS wk2
ON sh.workplace_num2 = wk2.workplace_num
INNER JOIN workplace AS wk3
ON sh.workplace_num3 = wk3.workplace_num

SQL Server Left Join With 'Or' Operator

I have four tables: TopLevelParent, two mid-level tables MidParentA and MidParentB, and a Child table which can have a parent of MidParentA or MidParentB (one or the other mid parent must be in place). Both mid-level tables have TopLevelParent as their parent table.
The top-level table looks like this:
TopLevelId | Name
--------------------------
1 | name1
2 | name2
The MidParent tables look like this:
MidParentAId | TopLevelParentId | MidParentBId | TopLevelParentId |
------------------------------------ ------------------------------------
1 | 1 | 1 | 1 |
2 | 1 | 2 | 1 |
The Child table looks like this:
ChildId | MidParentAId | MidParentBId
--------------------------------
1 | 1 | NULL
2 | NULL | 2
I have used the following left join in a larger stored procedure which is timing out, and it looks like the OR operator on the last left join is the culprit:
SELECT *
FROM TopLevelParent tlp
LEFT JOIN MidParentA a ON tlp.TopLevelPatientId = a.TopLevelPatientId
LEFT JOIN MidParentB b ON tlp.TopLevelPatientId = b.TopLevelPatientId
LEFT JOIN Child c ON c.ParentAId = a.ParentAId OR c.ParentBId = b.ParentBId
Is there a more performant way to do this join?
Given how little of the query is shown, a very rough rule of thumb is to replace an OR with a UNION to avoid table scanning.
Select..
LEFT JOIN Child c ON c.ParentAId = a.ParentAId
union
Select..
left Join Child c ON c.ParentBId = b.ParentBId
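A minimal sketch of that rewrite, run against the sample rows from the question with Python's sqlite3 module. The column names are taken from the sample tables (TopLevelId, TopLevelParentId, MidParentAId, MidParentBId) and are an assumption, since the posted query uses slightly different names. On this data both forms give the same distinct (TopLevelId, ChildId) pairs; with outer joins that is not guaranteed in general, because the NULL-extended rows can differ, so compare results on your own data.

```python
import sqlite3

# Sample rows from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TopLevelParent (TopLevelId INTEGER, Name TEXT);
CREATE TABLE MidParentA (MidParentAId INTEGER, TopLevelParentId INTEGER);
CREATE TABLE MidParentB (MidParentBId INTEGER, TopLevelParentId INTEGER);
CREATE TABLE Child (ChildId INTEGER, MidParentAId INTEGER, MidParentBId INTEGER);
INSERT INTO TopLevelParent VALUES (1, 'name1'), (2, 'name2');
INSERT INTO MidParentA VALUES (1, 1), (2, 1);
INSERT INTO MidParentB VALUES (1, 1), (2, 1);
INSERT INTO Child VALUES (1, 1, NULL), (2, NULL, 2);
""")

# Original shape: one join to Child with an OR in the ON clause.
or_rows = conn.execute("""
SELECT tlp.TopLevelId, c.ChildId
FROM TopLevelParent AS tlp
LEFT JOIN MidParentA AS a ON a.TopLevelParentId = tlp.TopLevelId
LEFT JOIN MidParentB AS b ON b.TopLevelParentId = tlp.TopLevelId
LEFT JOIN Child AS c
       ON c.MidParentAId = a.MidParentAId OR c.MidParentBId = b.MidParentBId
""").fetchall()

# Rewritten shape: one branch per mid-level parent, combined with UNION.
union_rows = conn.execute("""
SELECT tlp.TopLevelId, c.ChildId
FROM TopLevelParent AS tlp
LEFT JOIN MidParentA AS a ON a.TopLevelParentId = tlp.TopLevelId
LEFT JOIN Child AS c ON c.MidParentAId = a.MidParentAId
UNION
SELECT tlp.TopLevelId, c.ChildId
FROM TopLevelParent AS tlp
LEFT JOIN MidParentB AS b ON b.TopLevelParentId = tlp.TopLevelId
LEFT JOIN Child AS c ON c.MidParentBId = b.MidParentBId
""").fetchall()

# The OR form repeats some pairs, so compare the distinct pairs.
print(set(or_rows) == set(union_rows))
```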
Here is what I did in the end, which got the execution time down from 52 secs to 4 secs.
SELECT *
FROM (
SELECT tpl.*, a.MidParentAId as 'MidParentId', 1 as 'IsMidParentA'
FROM TopLevelParent tpl
INNER JOIN MidParentA a ON a.TopLevelParentId = tpl.TopLevelParentID
UNION
SELECT tpl.*, b.MidParentBId as 'MidParentId', 0 as 'IsMidParentA'
FROM TopLevelParent tpl
INNER JOIN MidParentB b ON b.TopLevelParentId = tpl.TopLevelParentID
UNION
SELECT tpl.*, 0 as 'MidParentId', 0 as 'IsMidParentA'
FROM TopLevelParent tpl
WHERE tpl.TopLevelParentID NOT IN (
SELECT a.TopLevelParentID
FROM TopLevelParent tpl
INNER JOIN MidParentA a ON a.TopLevelParentId = tpl.TopLevelParentID
UNION
SELECT b.TopLevelParentID
FROM TopLevelParent tpl
INNER JOIN MidParentB b ON b.TopLevelParentId = tpl.TopLevelParentID
)
) tpl
LEFT JOIN MidParentA a ON a.TopLevelParentId = tpl.TopLevelParentID
LEFT JOIN MidParentB b ON b.TopLevelParentId = tpl.TopLevelParentID
LEFT JOIN
(
SELECT [ChildId]
,[MidParentAId] as 'MidParentId'
,1 as 'IsMidParentA'
FROM Child c
WHERE c.MidParentAId IS NOT NULL
UNION
SELECT [ChildId]
,[MidParentBId] as 'MidParentId'
,0 as 'IsMidParentA'
FROM Child c
WHERE c.MidParentBId IS NOT NULL
) AS c
ON c.MidParentId = tpl.MidParentId AND c.IsMidParentA = tpl.IsMidParentA
This eliminates the table scanning that was happening, as I have matched the top level record to its midlevel parent up front if it exists, and stamped it on that record.
I have also done the same with the child record meaning I can then just join the child record to the top level record on the MidParentId, and I use the IsMidParentA bit flag to differentiate where there are two identical MidParentIds (ie an Id of 1 for IsMidParentA and IsMidParentB).
Thanks to all who took the time to answer.
You should take care when using predicates inside ON; put filtering predicates in the WHERE clause instead.
"It is very important to understand that, with outer joins, the ON and WHERE clauses play very different roles, and therefore, they aren’t interchangeable. The WHERE clause still plays a simple filtering role: namely, it keeps true cases and discards false and unknown cases. However, the ON clause doesn’t play a simple filtering role; rather, it’s more a matching role. In other words, a row in the preserved side will be returned whether the ON predicate finds a match for it or not. So the ON predicate only determines which rows from the nonpreserved side get matched to rows from the preserved side, not whether to return the rows from the preserved side." (Exam 70-461: Querying Microsoft SQL Server 2012)
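The difference between matching in ON and filtering in WHERE is easy to see on a tiny example. This is a sketch with Python's sqlite3 module; the two tables reuse the TopLevelParent and MidParentA sample rows from the question, and the predicate MidParentAId > 1 is an arbitrary one chosen only for illustration.

```python
import sqlite3

# Two parents; only parent 1 has mid-level rows (ids 1 and 2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TopLevelParent (TopLevelId INTEGER, Name TEXT);
CREATE TABLE MidParentA (MidParentAId INTEGER, TopLevelParentId INTEGER);
INSERT INTO TopLevelParent VALUES (1, 'name1'), (2, 'name2');
INSERT INTO MidParentA VALUES (1, 1), (2, 1);
""")

# Predicate in ON: it only decides which MidParentA rows get matched.
on_rows = conn.execute("""
SELECT tlp.TopLevelId, a.MidParentAId
FROM TopLevelParent AS tlp
LEFT JOIN MidParentA AS a
       ON a.TopLevelParentId = tlp.TopLevelId AND a.MidParentAId > 1
ORDER BY tlp.TopLevelId
""").fetchall()

# Predicate in WHERE: it filters the joined result, dropping NULL-extended rows.
where_rows = conn.execute("""
SELECT tlp.TopLevelId, a.MidParentAId
FROM TopLevelParent AS tlp
LEFT JOIN MidParentA AS a ON a.TopLevelParentId = tlp.TopLevelId
WHERE a.MidParentAId > 1
ORDER BY tlp.TopLevelId
""").fetchall()

print(on_rows)     # parent 2 is preserved with a NULL match
print(where_rows)  # parent 2 is gone
```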
Another way to write it:
LEFT JOIN Child c ON c.ParentAId = COALESCE(a.ParentAId, b.ParentBId)
Edit
One possible approach is querying first the MidParentA and then the MidParentB and then UNION the results:
SELECT tlp.*,
a.MidParentAId,
null MidParentBId,
c.ChildId
FROM TopLevelParent tlp
LEFT JOIN MidParentA a ON tlp.TopLevelPatientId = a.TopLevelPatientId
LEFT JOIN Child c ON c.MidParentAId = a.MidParentAId
UNION
SELECT tlp.*,
null MidParentAId,
b.MidParentBId,
c.ChildId
FROM TopLevelParent tlp
LEFT JOIN MidParentB b ON tlp.TopLevelPatientId = b.TopLevelPatientId
LEFT JOIN Child c ON c.MidParentBId = b.MidParentBId
A demo in SQLFiddle
Just to add something for future observers of this answer - sometimes a UNION as described above isn't suitable as the JOIN could be in the middle of a big query that would require lots of replication. This is where an APPLY comes in handy as you could use it without needing to replicate the entire outer query, as it has access to the columns from the outer query. Note: This is in reference to SQL Server only.
SELECT *
FROM TopLevelParent tlp
LEFT JOIN MidParentA a
ON tlp.TopLevelPatientId = a.TopLevelPatientId
LEFT JOIN MidParentB b
ON tlp.TopLevelPatientId = b.TopLevelPatientId
OUTER APPLY (
SELECT * FROM Child WHERE Child.ParentAId = a.ParentAId
UNION
SELECT * FROM Child WHERE Child.ParentBId = b.ParentBId
) c

SQL joins "going up" two tables

I'm trying to create a moderately complex query with joins:
SELECT `history`.`id`,
`parts`.`type_id`,
`serialized_parts`.`serial`,
`history_actions`.`action`,
`history`.`date_added`
FROM `history_actions`, `history`
LEFT OUTER JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT OUTER JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
WHERE `history_actions`.`id` = `history`.`action_id`
AND `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
I'd like to replace `parts`.`type_id` in the SELECT statement with `part_list`.`name` where the relationship I need to enforce between the two tables is `part_list`.`id` = `parts`.`type_id`. Also I have to use joins because in some cases `history`.`part_id` may be NULL which obviously isn't a valid part id. How would I modify the query to do this?
Here is some sample data as requested:
history table:
(source: ianburris.com)
serialized_parts table:
(source: ianburris.com)
parts table:
(source: ianburris.com)
part_list table:
(source: ianburris.com)
And what I want to see is:
id name serial action date_added
4 Battery 567 added 2010-05-19 10:42:51
3 Antenna Board 345 added 2010-05-19 10:42:51
2 Main Board 123 added 2010-05-19 10:42:51
1 NULL NULL created 2010-05-19 10:42:51
This would at least be on the right track...
If you're looking to NOT show any parts with an invalid ID, simply change the LEFT JOINs to INNER JOINs (they will restrict NULL values)
SELECT `history`.`id`
, `parts`.`type_id`
, `part_list`.`name`
, `serialized_parts`.`serial`
, `history_actions`.`action`
, `history`.`date_added`
FROM `history_actions`
INNER JOIN `history` ON `history`.`action_id` = `history_actions`.`id`
LEFT JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
LEFT JOIN `part_list` ON `part_list`.`id` = `parts`.`type_id`
WHERE `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
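To see the LEFT JOIN versus INNER JOIN behaviour on the NULL part_id row, here is a sketch using Python's sqlite3 module. The tables are trimmed to the columns needed for the name lookup, and the row values are guesses reconstructed from the expected output in the question, since the original sample tables were posted as images.

```python
import sqlite3

# Trimmed tables; values reconstructed from the expected output (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part_list (id INTEGER, name TEXT);
CREATE TABLE parts (id INTEGER, type_id INTEGER);
CREATE TABLE history (id INTEGER, part_id INTEGER);
INSERT INTO part_list VALUES (1, 'Main Board'), (2, 'Antenna Board'), (3, 'Battery');
INSERT INTO parts VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO history VALUES (1, NULL), (2, 1), (3, 2), (4, 3);
""")

query = """
SELECT h.id, pl.name
FROM history AS h
{join} JOIN parts AS p ON p.id = h.part_id
{join} JOIN part_list AS pl ON pl.id = p.type_id
ORDER BY h.id DESC
"""

left_rows = conn.execute(query.format(join="LEFT")).fetchall()
inner_rows = conn.execute(query.format(join="INNER")).fetchall()

print(left_rows)   # history row 1 is kept with a NULL name
print(inner_rows)  # history row 1 is dropped
```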
Boy, these backticks make my eyes hurt.
SELECT
h.id,
p.type_id,
pl.name,
sp.serial,
ha.action,
h.date_added
FROM
history h
INNER JOIN history_actions ha ON ha.id = h.action_id
LEFT JOIN parts p ON p.id = h.part_id
LEFT JOIN serialized_parts sp ON sp.parts_id = h.part_id
LEFT JOIN part_list pl ON pl.id = p.type_id
WHERE
h.unit_id = '1'
ORDER BY
h.id DESC