How to LEFT JOIN 4 tables in SQL? - sql

I have 4 tables - Controls, Risks, Processes & Regulations. They each have ID common instances of ID numbers. For example (ID1 exists across the 4 tables). The problem is that under each table, the number of instances of each ID varies (for ex, ID1 exists 5 times in Controls, 3 times in Risks, 0 in Processes & once in Regulations).
I need to LEFT JOIN all these tables so they are all joined by ID number
The code below works until Line 3, but when I add Line 4, it gives me a "Resultant table not allowed to have more than one AutoNumber field" error
SELECT *
FROM Controls
LEFT JOIN Processes ON Processes.TO_PRC_ID = Controls.TO_PRC_ID
LEFT JOIN Risks ON Risks.TO_PRC_ID = Controls.TO_PRC_ID
LEFT JOIN Regulations ON Regulations.TO_PRC_ID = Controls.TO_PRC_ID

MS Access requires extra parentheses for multiple joins:
SELECT *
FROM (Controls LEFT JOIN
Processes_Risks
ON Processes_Risks.TO_PRC_ID = Controls.TO_PRC_ID
) LEFT JOIN
Issues
ON Issues.TO_PRC_ID = Controls.TO_PRC_ID
And the process continues:
SELECT *
FROM ((Controls LEFT JOIN
Processes_Risks
ON Processes_Risks.TO_PRC_ID = Controls.TO_PRC_ID
) LEFT JOIN
Issues
ON Issues.TO_PRC_ID = Controls.TO_PRC_ID
) LEFT JOIN
Regulations
ON . . .

You have two or more table with the same column name so try use full qualified column name in select
SELECT c.TO_PRC_ID, p.TO_PRC_ID, r1.TO_PRC_ID, r2.TO_PRC_ID
FROM Controls c
LEFT JOIN Processes ON p p.TO_PRC_ID = c.TO_PRC_ID
LEFT JOIN Risks r1 ON r1.TO_PRC_ID = c.TO_PRC_ID
LEFT JOIN Regulations r2 ON r2.TO_PRC_ID = c.TO_PRC_ID

There are two different problems here. One problem is getting the right syntax for joining four tables. The other problem is the error message "Resultant table not allowed to have more than one AutoNumber field".
I don't have a copy of the tables being joined, but I suspect that more than one of them has an AutoNumber field in it. This is a field that automatically generates a record number when a new record is added to a table. Because the left join includes all of the fields in all of the tables, it will eventually include two different AutoNumber fields. MS Access cannot cope with that situation, so it declares there to be an error.
The proper though difficult way to deal with removing an AutoNumber field from a join is to list all of the other fields instead. So, instead of
FROM CONTROLS
one would need to code
FROM (SELECT A, B, C, D, WHATEVER FROM CONTROLS)
to eliminate the problem field.
If the tables have many fields, this becomes tedious to code. One alternative is to copy a table into a temporary table, drop the AutoNumber field from the copy, and use the copy instead of the original in the join. Whether this is a good or bad idea depends on the circumstances, such as how large the tables are, how often this would need to be done, and whether there is a good way to clean up the temporary tables later.

Related

Abap subquery Where Cond [duplicate]

I have a requirement to pull records, that do not have history in an archive table. 2 Fields of 1 record need to be checked for in the archive.
In technical sense my requirement is a left join where right side is 'null' (a.k.a. an excluding join), which in abap openSQL is commonly implemented like this (for my scenario anyways):
Select * from xxxx //xxxx is a result for a multiple table join
where xxxx~key not in (select key from archive_table where [conditions] )
and xxxx~foreign_key not in (select key from archive_table where [conditions] )
Those 2 fields are also checked against 2 more tables, so that would mean a total of 6 subqueries.
Database engines that I have worked with previously usually had some methods to deal with such problems (such as excluding join or outer apply).
For this particular case I will be trying to use ABAP logic with 'for all entries', but I would still like to know if it is possible to use results of a sub-query to check more than than 1 field or use another form of excluding join logic on multiple fields using SQL (without involving application server).
I have tested quite a few variations of sub-queries in the life-cycle of the program I was making. NOT EXISTS with multiple field check (shortened example below) to exclude based on 2 keys works in certain cases.
Performance acceptable (processing time is about 5 seconds), although, it's noticeably slower than the same query when excluding based on 1 field.
Select * from xxxx //xxxx is a result for a multiple table inner joins and 1 left join ( 1-* relation )
where NOT EXISTS (
select key from archive_table
where key = xxxx~key OR key = XXXX-foreign_key
)
EDIT:
With changing requirements (for more filtering) a lot has changed, so I figured I would update this. The construct I marked as XXXX in my example contained a single left join ( where main to secondary table relation is 1-* ) and it appeared relatively fast.
This is where context becomes helpful for understanding the problem:
Initial requirement: pull all vendors, without financial records in 3
tables.
Additional requirements: also exclude based on alternative
payers (1-* relationship). This is what example above is based on.
More requirements: also exclude based on alternative payee (*-* relationship between payer and payee).
Many-to-many join exponentially increased the record count within the construct I labeled XXXX, which in turn produces a lot of unnecessary work. For instance: a single customer with 3 payers, and 3 payees produced 9 rows, with a total of 27 fields to check (3 per row), when in reality there are only 7 unique values.
At this point, moving left-joined tables from main query into sub-queries and splitting them gave significantly better performance.
than any smarter looking alternatives.
select * from lfa1 inner join lfb1
where
( lfa1~lifnr not in ( select lifnr from bsik where bsik~lifnr = lfa1~lifnr )
and lfa1~lifnr not in ( select wyt3~lifnr from wyt3 inner join t024e on wyt3~ekorg = t024e~ekorg and wyt3~lifnr <> wyt3~lifn2
inner join bsik on bsik~lifnr = wyt3~lifn2 where wyt3~lifnr = lfa1~lifnr and t024e~bukrs = lfb1~bukrs )
and lfa1~lifnr not in ( select lfza~lifnr from lfza inner join bsik on bsik~lifnr = lfza~empfk where lfza~lifnr = lfa1~lifnr )
)
and [3 more sets of sub queries like the 3 above, just checking different tables].
My Conclusion:
When exclusion is based on a single field, both not in/not exits work. One might be better than the other, depending on filters you use.
When exclusion is based on 2 or more fields and you don't have many-to-many join in main query, not exists ( select .. from table where id = a.id or id = b.id or... ) appears to be the best.
The moment your exclusion criteria implements a many-to-many relationship within your main query, I would recommend looking for an optimal way to implement multiple sub-queries instead (even having a sub-query for each key-table combination will perform better than a many-to-many join with 1 good sub-query, that looks good).
Anyways, any additional insight into this is welcome.
EDIT2: Although it's slightly off topic, given how my question was about sub-queries, I figured I would post an update. After over a year I had to revisit the solution I worked on to expand it. I learned that proper excluding join works. I just failed horribly at implementing it the first time.
select header~key
from headers left join items on headers~key = items~key
where items~key is null
if it is possible to use results of a sub-query to check more than
than 1 field or use another form of excluding join logic on multiple
fields
No, it is not possible to check two columns in subquery, as SAP Help clearly says:
The clauses in the subquery subquery_clauses must constitute a scalar
subquery.
Scalar is keyword here, i.e. it should return exactly one column.
Your subquery can have multi-column key, and such syntax is completely legit:
SELECT planetype, seatsmax
FROM saplane AS plane
WHERE seatsmax < #wa-seatsmax AND
seatsmax >= ALL ( SELECT seatsocc
FROM sflight
WHERE carrid = #wa-carrid AND
connid = #wa-connid )
however you say that these two fields should be checked against different tables
Those 2 fields are also checked against two more tables
so it's not the case for you. Your only choice seems to be multi-join.
P.S. FOR ALL ENTRIES does not support negation logic, you cannot just use some sort of NOT IN FOR ALL ENTRIES, it won't be that easy.

Access SQL subquery access field from parent

I have a SQL query that works in Access 2016:
SELECT
Count(*) AS total_tests,
Sum(IIf(score>=securing_threshold And score<mastering_threshold,1,0)) AS total_securing,
Sum(IIf(score>=mastering_threshold,1,0)) AS total_mastering,
total_securing/Count(*) AS percent_securing,
total_mastering/Count(*) AS percent_mastering,
(Count(*)-total_securing-total_mastering)/Count(*) AS percent_below,
subjects.subject,
students.year_entered,
IIf(Month(Date())<9,Year(Date())-students.year_entered+6,Year(Date())-students.year_entered+7) AS current_form,
groups.group
FROM
((subjects
INNER JOIN tests ON subjects.ID = tests.subject)
INNER JOIN (students
INNER JOIN test_results ON students.ID = test_results.student) ON tests.ID = test_results.test)
LEFT JOIN
(SELECT * FROM group_membership LEFT JOIN groups ON group_membership.group = groups.ID) As g
ON students.ID = g.student
GROUP BY subjects.subject, students.year_entered, groups.group;
However, I wish to filter out irrelevant groups before joining them to my table. The table groups has a column subject which is a foreign key.
When I try changing ON students.ID = g.student to ON students.ID = g.student And subjects.ID = g.subject I get the error 'JOIN expression not supported'.
Alternatively, when I try adding WHERE subjects.ID = groups.subject to the subquery, it asks me for the parameter value of subjects.ID, although it is a column in the parent query.
Googling reveals many similar errors but they were all resolved by changing the brackets. That didn't help.
Just in case the table relationships help:
Thank you.
EDIT: Sample database at https://www.dropbox.com/s/yh80oooem6gsni7/student%20tracker.ACCDB?dl=0
MS Access queries with many joins are difficult to update by SQL alone as parenetheses pairings are required unlike other RDBMS's and these pairings must follow an order. Moreover, some pairings can even be nested. Hence, for beginners it is advised to build queries with complex and many joins using the query design GUI in the MS Office application and let it build out the SQL.
For a simple filter on the g derived table, you could filter subject on the derived table, g, but likely you want want all subjects:
...
(SELECT * FROM group_membership
LEFT JOIN groups ON group_membership.group = groups.ID
WHERE groups.subject='Earth Science') As g
...
So for all subjects, consider re-building query from scratch in GUI that nearly mirrors your table relationships which actually auto-links joins in the GUI. Then, drop unneeded tables.
Usually you want to begin with the join table or set like groups and group_membership or tests and test_results. In fact, consider saving the g derived table as its own query.
Then add the distinct record primary source tables like students and subjects.
You may even need to play around with order in FROM and JOIN clauses to attain desired results, and maybe even add the same table in query. And be careful with adding join tables like group_membership (two one-to-many links), to GROUP BY queries as it leads to the duplicate record aggregation. So you may need to join aggregates queries by subject.
Unless you can post content of all tables, from our perspective it is difficult to help from here.
Your subquery g uses a LEFT JOIN, but there is a enforced 1:n relation between the two tables, so there will always be a matching group. Use a INNER JOIN instead.
With g.subject you are trying to join on a column that is on the right side of a left join, that cannot really work.
Also you shouldn't use SELECT * on a join of tables with identical column names. Include only the qualified column names that you need.
LEFT JOIN
(SELECT group_membership.student, groups.group, groups.subject
FROM group_membership INNER JOIN groups
ON group_membership.group = groups.ID) As g
ON (students.ID = g.student AND subjects.ID = g.subject)
I would call the columns in group_membership group_ID and student_ID to avoid confusion.
I don't have the database to test, but I would use subject table as subquery:
(SELECT * FROM subject WHERE filter out what you don't need) Subj
Then INNER JOIN this new Subj Table in your query which would exclude irrelevant groups.
Also I would never create join in WHERE clause (WHERE subjects.ID = groups.subject), what this does it creates cartesian product (table with all the possible combinations of subjects.ID and groups.subject) then it filters out records to satisfy your join. When dealing with huge data it might take forever or crash.
Error related to "Join expression may not be supported"; do datatypes match in those fields?
I solved it by (a lot of trial and error and) taking the advice here to make queries in the GUI and joining them. The end result is 4 queries deep! If the database was bigger, performance would be awful... but now its working I can tweak it eventually.
Thank you everybody!

Contradiction Between Multiple Left Joins

I am trying to understand the following query which is automatically produced by some software library:
SELECT DISTINCT `t`.* FROM `teacher` AS `t`
LEFT JOIN `rel` AS `rel_profile`
ON `rel_profile`.`field_id` = 2319 AND `rel_profile`.`item_id` = `t`.`id`
LEFT JOIN `teacher_info` AS `profile`
ON `profile`.`id` = `rel_profile`.`related_item_id`
LEFT JOIN `rel` AS `rel_profile_city`
ON `rel_profile_city`.`field_id` = 2320 AND `rel_profile_city`.`item_id` = `profile`.`id` WHERE `rel_profile_city`.`item_id` = 1
There are three left joins. I understand the first and second one. What I don't understand is the third left join:
LEFT JOIN `rel` AS `rel_profile_city`
ON `rel_profile_city`.`field_id` = 2320 AND `rel_profile_city`.`item_id` = `profile`.`id` WHERE `rel_profile_city`.`item_id` = 1
The table rel has already been used in the first left join:
LEFT JOIN `rel` AS `rel_profile`
ON `rel_profile`.`field_id` = 2319
Now, the same table is left joined again but this time the value of the joined field is different:
LEFT JOIN `rel` AS `rel_profile_city`
ON `rel_profile_city`.`field_id` = 2320
How come that these two joins do not contradict?
The query is using aliases:
`rel` AS `rel_profile`
Says to pretend that the table rel is actually a table called rel_profile. That alias is then used throughout the rest of the query. I'm not sure of MySQL, but on some other database systems, it's an error to refer to the table as rel from then onwards(*) (unless there's another join that re-introduces the table and doesn't provide an alias).
And joining to the same table multiple times is allowed - provided that the names (or aliases) are unique. This is useful when you're trying to construct a result that relies on the content of multiple rows from the same table, where the result should occupy a single row.
(*) "Then onwards" being in the order in which the clauses are processed, not the text order. E.g. you should use the alias in the SELECT clause because, even though it occurs earlier textually, it's (conceptually) processed after the FROM clause.
This query will show teacher rows that have associated rows in rel with field_id = 2319 OR field_id = 2320
The are not "contradicting" each other. Imagine you have a table of users, wich have the demographic and personal data of your users. And another table with the "relation" between users. So, in this "relations" table, you have columns UserId1 and UserId2. If you want a query that returns the data of those two users, you'll need to do two JOINS with the table Users, once per each User column. This doesn't mean that they are contradicting each other.

Successive LEFT SQL joins back to original

I need to redo sql statement in legacy Foxpro application and don't understand whether it is meaningful at all. Syntax is a bit specific - it extracts data from temporary table into the same temporary table ( overwriting) with some joins.
SELECT aa.*,b.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
LEFT JOIN job2 ON jobs.job_no=job2.rucjob;
left join jobs b on b.job_no=job2.job_no;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
Since only one field is added from joined tables ( spa_date ) is there any point in 2 left joins or I am missing something. Isn't it equivalent to
SELECT aa.*,jobs.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
They are different because b.spa_date come from the second left join. You may be missing filtered rows without both left joins.
You would need to know the intent of the original query and perhaps rewrite it to make more sense but I'd say the two queries are different.

Why does my left join in Access have fewer rows than the left table?

I have two tables in an MS Access 2010 database: TBLIndividuals and TblIndividualsUpdates. They have a lot of the same data, but the primary key may not be the same for a given person's record in both tables. So I'm doing a join between the two tables on names and birthdates to see which records correspond. I'm using a left join so that I also get rows for the people who are in TblIndividualsUpdates but not in TBLIndividuals. That way I know which records need to be added to TBLIndividuals to get it up to date.
SELECT TblIndividuals.PersonID AS OldID,
TblIndividualsUpdates.PersonID AS UpdateID
FROM TblIndividualsUpdates LEFT JOIN TblIndividuals
ON ( (TblIndividuals.FirstName = TblIndividualsUpdates.FirstName)
and (TblIndividuals.LastName = TblIndividualsUpdates.LastName)
AND (TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or (TblIndividuals.DateBorn is null
and (TblIndividuals.MidName is null and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName))));
TblIndividualsUpdates has 4149 rows, but the query returns only 4103 rows. There are about 50 new records in TblIndividualsUpdates, but only 4 rows in the query result where OldID is null.
If I export the data from Access to PostgreSQL and run the same query there, I get all 4149 rows.
Is this a bug in Access? Is there a difference between Access's left join semantics and PostgreSQL's? Is my database corrupted (Compact and Repair doesn't help)?
ON (
TblIndividuals.FirstName = TblIndividualsUpdates.FirstName
and
TblIndividuals.LastName = TblIndividualsUpdates.LastName
AND (
TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or
(
TblIndividuals.DateBorn is null
and
(
TblIndividuals.MidName is null
and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName
)
)
)
);
What I would do is systematically remove all the join conditions except the first two until you find the records drop off. Then you will know where your problem is.
This should never happen. Unless rows are being inserted/deleted in the meantime,
the query:
SELECT *
FROM a LEFT JOIN b
ON whatever ;
should never return less rows than:
SELECT *
FROM a ;
If it happens, it's a bug. Are you sure the queries are exactly like this (and you have't omitted some detail, like a WHERE clause)? Are you sure that the first returns 4149 rows and the second one 4103 rows? You could make another check by changing the * above to COUNT(*).
Drop any indexes from both tables which include those JOIN fields (FirstName, LastName, and DateBorn). Then see whether you get the expected
4,149 rows with this simplified query.
SELECT
i.PersonID AS OldID,
u.PersonID AS UpdateID
FROM
TblIndividualsUpdates AS u
LEFT JOIN TblIndividuals AS i
ON
(
(i.FirstName = u.FirstName)
AND (i.LastName = u.LastName)
AND (i.DateBorn = u.DateBorn)
);
For whatever it is worth, since this seems to be a deceitful bug and any additional information could help resolving it, I have had the same problem.
The query is too big to post here and I don't have the time to reduce it now to something suitable, but I can report what I found. In the below, all joins are left joins.
I was gradually refining and changing my query. It had a derived table in it (D). And the whole thing was made into a derived table (T) and then joined to a last table (L). In any case, at one point in its development, no field in T that originated in D participated in the join to L. It was then the problem occurred, the total number of rows mysteriously became less than the main table, which should be impossible. As soon as I again let a field from D participate (via T) in the join to L, the number increased to normal again.
It was as if the join condition to D was moved to a WHERE clause when no field in it was participating (via T) in the join to L. But I don't really know what the explanation is.