Find missing records in many-to-many junction table - sql

I have a many-to-many junction table that joins 2 tables:
table1
table1_id
-----------
1
2
3
table2
table2_id
------------
A
B
C
join_table
table1_id | table2_id
------------|--------------
1 | A
1 | B
1 | C
2 | A
2 | C
3 | B
How can I write a single query to find which records are missing from the join table? I want all combinations accounted for. Basically, I want a query that returns these missing records:
table1_id | table2_id
---------------------------
2 | B
3 | A
3 | C
I feel like this should be simple but I haven't figured it out.

Do a CROSS JOIN to get all possible combinations. Then use EXCEPT to remove the existing combinations.
select t1.table1_id, t2.table2_id
from table1 t1 cross join table2 t2
except
select table1_id, table2_id from join_table
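To check the CROSS JOIN plus EXCEPT approach against the sample data in the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite and the in-memory database are assumptions made only so the query can be run as-is; the table and column names are taken from the question.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient test bed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (table1_id INTEGER);
CREATE TABLE table2 (table2_id TEXT);
CREATE TABLE join_table (table1_id INTEGER, table2_id TEXT);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES ('A'), ('B'), ('C');
INSERT INTO join_table VALUES
  (1, 'A'), (1, 'B'), (1, 'C'), (2, 'A'), (2, 'C'), (3, 'B');
""")

# All possible pairs, minus the pairs that already exist.
missing = conn.execute("""
SELECT t1.table1_id, t2.table2_id
FROM table1 t1 CROSS JOIN table2 t2
EXCEPT
SELECT table1_id, table2_id FROM join_table
ORDER BY 1, 2
""").fetchall()

print(missing)  # [(2, 'B'), (3, 'A'), (3, 'C')]
```

On an engine without EXCEPT, filtering the cross join with NOT EXISTS against join_table should give the same rows.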

Related

How to populate a table based on a value from a different table

I have two tables of data which I can join using a left join on the ID in both tables. Where the course and the person are the same, I need to populate the RegNumber with the RegNumber that already exists for one row of that group.
This is how it looks currently if I join table 1 and table 2 with a left join:
Table 1
ID | Course| Person
67705 | A | 1
68521 | A | 1
85742 | A | 1
89625 | A | 1
67857 | B | 2
86694 | B | 2
88075 | B | 2
88710 | C | 3
47924 | C | 3
66981 | C | 3
12311 | B | 1
12312 | B | 1
12313 | B | 1
Table 2
ID | RegNumber
67705 | N712316
NULL | NULL
NULL | NULL
NULL | NULL
67857 | N712338
NULL | NULL
NULL | NULL
NULL | NULL
47924 | M481035
NULL | NULL
12311 | N645525
NULL | NULL
NULL | NULL
I need table 2 to look like this:
ID | RegNumber
67705 | N712316
68521 | N712316
85742 | N712316
89625 | N712316
67857 | N712338
86694 | N712338
88075 | N712338
88710 | M481035
47924 | M481035
66981 | M481035
12311 | N645525
12312 | N645525
12313 | N645525
That is, I need to insert new rows into Table 2.
Can anyone help me please? This is totally beyond my capability!
insert into table2 (ID, RegNumber)
select t1.ID, reg.regNumber
from table1 t1
-- cross apply (not cross join) is needed because the subquery refers to t1
cross apply (select top 1 r2.regNumber from table2 r2 join table1 r1
on r1.Id = r2.Id
and r1.Course = t1.Course
and r1.Person = t1.Person
order by r1.Id) reg
where not exists (select 1 from table2 t2 where t1.ID = t2.ID)
You can improve performance a little by loading the data into a temp table first:
select t1.ID, Course, Person, regNumber
into #LoadedData
from table1 t1
join table2 t2 on t1.Id = t2.ID;
insert into table2 (ID, RegNumber)
select t1.ID, reg.regNumber
from table1 t1
-- cross apply (not cross join) is needed because the subquery refers to t1
cross apply (select top 1 regNumber from #LoadedData l
where l.Course = t1.Course
and l.Person = t1.Person
order by l.ID) reg
where not exists (select 1 from #LoadedData l where t1.ID = l.ID)
In either case, an index on (ID, Course, Person) will help with performance.
Assuming:
You are missing items in table 2 that inherit data from other records in table 1.
Two different IDs share the same RegNumber when they have BOTH course and person in common.
You really need to join table 1 to itself to create the mapping that associates ID 67705 with ID 68521, then you can join in table 2 to pick up the Regnumber.
Try this:
Insert into table2 (ID, RegNumber)
Select right1.ID, left2.RegNumber
From table2 left2
INNER JOIN table1 left1 On (left1.ID = left2.ID)
INNER JOIN table1 right1 On (left1.Course = right1.Course AND left1.Person = right1.Person)
LEFT OUTER JOIN table2 right2 On (right1.ID = right2.ID)
WHERE right2.ID Is Null
The 4th table join (alias right2) is an anti-join: together with the WHERE clause it skips every ID that already has a row in table2, including the source row itself, so nothing is inserted twice.
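The self-join insert can be tried against the question's sample data with a small sketch like the one below. It assumes SQLite through Python's sqlite3 module (the thread itself appears to target SQL Server) and assumes that table2 really holds only the four non-NULL rows, the NULL rows in the question being unmatched rows of the left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER, Course TEXT, Person INTEGER);
CREATE TABLE table2 (ID INTEGER, RegNumber TEXT);
INSERT INTO table1 VALUES
  (67705, 'A', 1), (68521, 'A', 1), (85742, 'A', 1), (89625, 'A', 1),
  (67857, 'B', 2), (86694, 'B', 2), (88075, 'B', 2),
  (88710, 'C', 3), (47924, 'C', 3), (66981, 'C', 3),
  (12311, 'B', 1), (12312, 'B', 1), (12313, 'B', 1);
-- Assumption: only these four rows exist in table2 before the insert.
INSERT INTO table2 VALUES
  (67705, 'N712316'), (67857, 'N712338'),
  (47924, 'M481035'), (12311, 'N645525');
""")

# left2/left1: rows that already have a RegNumber.
# right1: every table1 row with the same Course and Person.
# right2: anti-join that skips IDs already present in table2.
conn.execute("""
INSERT INTO table2 (ID, RegNumber)
SELECT right1.ID, left2.RegNumber
FROM table2 left2
JOIN table1 left1 ON left1.ID = left2.ID
JOIN table1 right1 ON left1.Course = right1.Course
                  AND left1.Person = right1.Person
LEFT JOIN table2 right2 ON right1.ID = right2.ID
WHERE right2.ID IS NULL
""")

row_count = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
regs = dict(conn.execute("SELECT ID, RegNumber FROM table2"))
print(row_count, regs[68521], regs[88710])
```

With this data every ID in table1 ends up with exactly one row in table2; ID 88710 (course C, person 3) picks up M481035 under the same-course-and-person rule.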
I have solved this myself.
I concatenated the person and course columns and then joined on that new concatenated field:
insert into table2 (ID, RegNumber)
select X1.ID,X2.Regnumber
from (select concat(course,person) as X,ID from table1) X1
join (select concat(t1.course,t1.person) as X, t2.RegNumber
from table1 t1
join table2 t2 on t1.ID = t2.ID) X2
on X1.X = X2.X
where X1.ID not in (select ID from table2)

Query table based on the value of another tables column

I have read-only access to a database, and two of its tables contain information I need. Both tables have the same numbers in column A, which refer to an account. I want to return all accounts in table 1 that have "AD" in column B and whose account has the value "4" in column C of table 2. Below is an example.
table 1
A | B
1 | AC
2 | AD
3 | AC
4 | AD
table 2
A | B | C
1 | AB | 4
2 | AB | 5
3 | AB | 4
4 | AB | 4
I have tried the query
SELECT * FROM Table 1 WHERE column B = 'AD' and WHERE column C = '4' FROM TABLE 2
You can use an inner join instead, like this:
SELECT
t1.*
FROM
Table1 t1 JOIN Table2 t2 ON t1.A = t2.A
WHERE
t2.C = 4 AND t1.B = 'AD'
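As a quick check of the join on the example data, here is a minimal sketch using Python's sqlite3 module. The engine choice and the names Table1 and Table2 (written without the space used in the question) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (A INTEGER, B TEXT);
CREATE TABLE Table2 (A INTEGER, B TEXT, C INTEGER);
INSERT INTO Table1 VALUES (1, 'AC'), (2, 'AD'), (3, 'AC'), (4, 'AD');
INSERT INTO Table2 VALUES (1, 'AB', 4), (2, 'AB', 5), (3, 'AB', 4), (4, 'AB', 4);
""")

# Link the tables on the shared account number in column A,
# then apply one filter per table in a single WHERE clause.
rows = conn.execute("""
SELECT t1.*
FROM Table1 t1 JOIN Table2 t2 ON t1.A = t2.A
WHERE t2.C = 4 AND t1.B = 'AD'
""").fetchall()

print(rows)  # [(4, 'AD')]
```

Account 2 also has "AD" in table 1, but its C value in table 2 is 5, so only account 4 comes back.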
There isn't quite enough info here for me to help. There isn't common data between the two tables to link them.
Your query above is missing a join between the two tables. You also only need to state the WHERE clause once; additional criteria can be added with AND.

SQL Join On Columns of Different Length

I'm trying to join two tables together in SQL where the columns contain a different number of unique entries.
When I use a full join the additional entries in the column joined on are missing.
The code I'm using is (in a SAS proc SQL):
proc sql;
create table table3 as
select table1.*, table2.*
from table1 full join table2
on table1.id = table2.id;
quit;
Visual example of the problem (I can't show the actual tables as they contain sensitive data):
Table 1
id | count1
1 | 2
2 | 3
3 | 2
Table 2
id | count2
1 | 4
2 | 5
3 | 6
4 | 2
Table 3
id | count1 | count2
1 | 2 | 4
2 | 3 | 5
3 | 2 | 6
- | - | 2 <----- I don't want the id column to be blank in this row
I hope I've explained my problem clearly enough, thanks in advance for your help.
The id from table 1 is blank because the row from table2 has no match in table 1. Try looking at the output from this query:
select coalesce(table1.id, table2.id) as id, table1.count1, table2.count2
from table1 full join table2
on table1.id = table2.id;
Coalesce works from left to right, returning the first non-null value (it can take more than 2 arguments). If the id in table 1 is null, it uses the id from table 2 instead.
I also recommend aliasing all tables in queries, so I would have written this:
SELECT
COALESCE(t1.id, t2.id) as id,
t1.count1,
t2.count2
FROM
table1 t1
FULL OUTER JOIN
table2 t2
ON
t1.id = t2.id;
Simply selecting coalesce(t1.id, t2.id) will return the first non-null id value.
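The effect of COALESCE on the unmatched row can be seen with a small sketch on the example tables. It uses Python's sqlite3 module, which is an assumption (the question uses SAS proc SQL). Because older SQLite versions lack FULL JOIN, the full join is emulated here with two LEFT JOINs and UNION ALL; on an engine with FULL JOIN the single query from the answers above applies directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, count1 INTEGER);
CREATE TABLE table2 (id INTEGER, count2 INTEGER);
INSERT INTO table1 VALUES (1, 2), (2, 3), (3, 2);
INSERT INTO table2 VALUES (1, 4), (2, 5), (3, 6), (4, 2);
""")

# First branch: every table1 row with its match (if any) from table2.
# Second branch: table2 rows that have no match in table1.
rows = conn.execute("""
SELECT COALESCE(t1.id, t2.id) AS id, t1.count1, t2.count2
FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.id
UNION ALL
SELECT COALESCE(t1.id, t2.id), t1.count1, t2.count2
FROM table2 t2 LEFT JOIN table1 t1 ON t1.id = t2.id
WHERE t1.id IS NULL
ORDER BY 1
""").fetchall()

print(rows)  # [(1, 2, 4), (2, 3, 5), (3, 2, 6), (4, None, 2)]
```

The last row keeps id 4 from table2 while count1 stays NULL, which is the behaviour the question asks for.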

How to do an outer join with full result between two tables

I have two tables:
TABLE1
id_attr
-------
1
2
3
TABLE2
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
As a result I want a table that shows:
RESULT
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
10 | 3 | NULL
So I want the row with id=10 and id_attr=3 even when id_attr=3 is missing in TABLE2 (and I know it is missing because I have a NULL value, or something else, in the val column of RESULT).
NB: I could have other ids in table2. For example, after inserting the row {11,1,A} into table2, I want this as RESULT:
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
10 | 3 | NULL
11 | 1 | A
11 | 2 | NULL
11 | 3 | NULL
So, for every id, I want always the match with all id_attr.
Your specific example only has one id, so you can use the following:
select t2.id, t2.id_attr, t2.val
from table2 t2
union all
select 10, t1.id_attr, NULL
from table1 t1
where not exists (select 1 from table2 t2 where t2.id_attr = t1.id_attr);
EDIT:
You can get all combinations of attributes and ids in the following way. Use a cross join to create all the rows you want and then a left join to bring in the data you want:
select i.id, t1.id_attr, t2.val
from (select distinct id from table2) i cross join
table1 t1 left join
table2 t2
on t2.id = i.id and t2.id_attr = t1.id_attr;
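The cross join plus left join can be checked against the two-id example from the question with a minimal sketch such as this one. SQLite through Python's sqlite3 module is an assumption; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id_attr INTEGER);
CREATE TABLE table2 (id INTEGER, id_attr INTEGER, val TEXT);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES (10, 1, 'A'), (10, 2, 'B'), (11, 1, 'A');
""")

# i x t1 builds every (id, id_attr) pair; the left join then fills in
# val where table2 has a matching row and leaves NULL elsewhere.
rows = conn.execute("""
SELECT i.id, t1.id_attr, t2.val
FROM (SELECT DISTINCT id FROM table2) i
CROSS JOIN table1 t1
LEFT JOIN table2 t2 ON t2.id = i.id AND t2.id_attr = t1.id_attr
ORDER BY i.id, t1.id_attr
""").fetchall()

for row in rows:
    print(row)
```

Each id from table2 appears once per id_attr in table1, with val set to None (NULL) for the combinations that table2 does not contain.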
It sounds like you want to do just an outer join on id_attr instead of id.
select * from table2 t2
left outer join table1 t1 on t2.id_attr = t1.id_attr;

Delete subtable from a table, SQL

What is the most clever way to delete tuples from table1, that are in the second table,
if the second table is not a part of initial database, but a result of some really big query?
table1
id1 | id2
1 | 2
3 | 4
5 | 6
7 | 8
9 | 10
11 | 12
13 | 14
second table (this table is a result of some query)
id1 | id2
5 | 6
1 | 2
11 | 12
I came up with
delete from table1
where id1 in (select id1 from ( really long query to get a second table))
and id2 in (select id2 from (the same really long query to get a second table));
It works, but I feel like I'm doing it way too wrong, and not keeping the query DRY.
And would the way you suggest work the same if table1 had an additional column, for example "somecol"?
IMO, you can use an EXISTS statement like this:
DELETE FROM table1
WHERE EXISTS (
SELECT 1
FROM (<your long query>) AS dt
WHERE table1.id1 = dt.id1
AND table1.id2 = dt.id2);
One method is to use with, delete and exists:
with secondtable as (
<your query here>
)
delete from table1
where exists (select 1
from secondtable st
where table1.id1 = st.id1 and table1.id2 = st.id2
);
A Correlated Subquery using EXISTS allows matching multiple columns:
delete
from table1
where exists
( select * from
(
"really long query"
) as t2
where table1.id1 = t2.id1 -- correlating inner and outer table
and table1.id2 = t2.id2 -- similar to a join-condition
)
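To see how the correlated EXISTS behaves, here is a minimal sketch using Python's sqlite3 module. Several things are assumptions for illustration: the engine, the table second_table standing in for the really long query, and the extra row (1, 6), which is hypothetical and not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id1 INTEGER, id2 INTEGER);
-- Hypothetical stand-in for the result of the long query.
CREATE TABLE second_table (id1 INTEGER, id2 INTEGER);
INSERT INTO table1 VALUES
  (1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14),
  (1, 6);  -- hypothetical extra row: id1 and id2 each occur in
           -- second_table, but never together as a pair
INSERT INTO second_table VALUES (5, 6), (1, 2), (11, 12);
""")

# The subquery is correlated on both columns, so a row is deleted
# only when the whole (id1, id2) pair exists in the derived table.
conn.execute("""
DELETE FROM table1
WHERE EXISTS (
  SELECT 1
  FROM (SELECT id1, id2 FROM second_table) AS dt
  WHERE table1.id1 = dt.id1
    AND table1.id2 = dt.id2)
""")

remaining = conn.execute(
    "SELECT id1, id2 FROM table1 ORDER BY id1, id2").fetchall()
print(remaining)  # [(1, 6), (3, 4), (7, 8), (9, 10), (13, 14)]
```

The row (1, 6) survives here, whereas two independent IN conditions on id1 and id2 would match it. Extra columns in table1, such as "somecol", do not affect the match because only id1 and id2 are compared.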