Comparing 3 SQL Tables - sql

I am trying to compare 3 tables, with one being the base table.
Here are my 3 tables: Table1 is the base table, and the other two are compared with each other.
Table1
ID | ChargeItem
-----------------
5055 | Item1
5056 | Item2
5057 | Item3
5058 | Item4
5059 | Item5
5060 | Item6
5061 | Item7
5062 | Item8
5063 | Item9
5064 | Item10
5065 | Item11
Table2
ID | membershiprecordid | ChargeItemID | Status
-----------------------------------------------
1 | 268765 | 5060 | 1
2 | 268765 | 5060 | 1
Table3
ID | ChargeItemID
--------------------
12146 | 5058
12146 | 5060
12146 | 5062
12146 | 5063
12146 | 5065
Here is my SQL query so far
SELECT T1.ID
FROM Table1 as T1
WHERE T1.ID NOT in (
select T2.chargeitemid from Table2 as T2
right join Table3 as T3 on T2.chargeitemid = T3.chargeitemid
where T2.membershiprecordid = 268765 AND T2.[Status] = 2
)
In this query I am trying to get back the IDs from Table1 that don't exist in Table2 and Table3. Inside the subquery I compare Table2 and Table3, with Table2 taking priority over Table3: if a ChargeItemID exists in Table2 with Status = 2, it should be fetched and returned along with the IDs from Table1.
Currently the query doesn't return any IDs from Table1. Any suggestions as to why?
The result should be the following ChargeItem IDs from Table1:
5055,
5056,
5057,
5059,
5061,
5064
Hopefully it explains my issue?
Thanks
UPDATE
Please ignore T2.ClubID = 1600; it was posted in error.

Try the query below to find the IDs that are present in Table1 and not present in both Table2 and Table3.
CREATE TABLE TABLE1 (ID INT);
INSERT INTO TABLE1 VALUES (5055), (5056), (5060), (5065), (5057);
CREATE TABLE TABLE2 (ID INT, CHARGEITEMID INT, STATUS INT)
INSERT INTO TABLE2 VALUES (1, 5060,1)
INSERT INTO TABLE2 VALUES (2, 5065,1)
INSERT INTO TABLE2 VALUES (2, 5056,2)
CREATE TABLE TABLE3 (ID INT, CHARGEITEMID INT )
INSERT INTO TABLE3 VALUES (1, 5058)
INSERT INTO TABLE3 VALUES (1, 5060)
INSERT INTO TABLE3 VALUES (1, 5062)
INSERT INTO TABLE3 VALUES (1, 5063)
INSERT INTO TABLE3 VALUES (1, 5065)
INSERT INTO TABLE3 VALUES (1, 5056)
select * from TABLE1
select * from TABLE2
select * from TABLE3
select id
from TABLE1
except
(
select CHARGEITEMID from TABLE2
intersect
select CHARGEITEMID from TABLE3
)

Can you check whether this accomplishes your request? (Maybe I didn't understand it very well.)
SELECT T1.ID
FROM Table1 as T1
LEFT JOIN (select DISTINCT T3.chargeitemid, T2.STATUS
from Table2 as T2
right join Table3 as T3 on T2.chargeitemid = T3.chargeitemid
) T4 ON T1.ID = T4.CHARGEITEMID
WHERE T4.chargeitemid IS NULL
OR T4.Status = 2
Sample data:
CREATE TABLE TABLE1 (ID INT);
INSERT INTO TABLE1 VALUES (5055), (5056), (5060), (5065), (5057);
CREATE TABLE TABLE2 (ID INT, CHARGEITEMID INT, STATUS INT)
INSERT INTO TABLE2 VALUES (1, 5060,1)
INSERT INTO TABLE2 VALUES (2, 5065,1)
INSERT INTO TABLE2 VALUES (2, 5056,2)
CREATE TABLE TABLE3 (ID INT, CHARGEITEMID INT )
INSERT INTO TABLE3 VALUES (1, 5058)
INSERT INTO TABLE3 VALUES (1, 5060)
INSERT INTO TABLE3 VALUES (1, 5062)
INSERT INTO TABLE3 VALUES (1, 5063)
INSERT INTO TABLE3 VALUES (1, 5065)
INSERT INTO TABLE3 VALUES (1, 5056)
Output:
ID
-----------
5055
5056
5057

A few things:
You mention "Table 2 taking priority over Table 3." If you perform a RIGHT JOIN, you are saying that Table 3's rows should be returned regardless of whether there is a match in Table 2. Here, "RIGHT" refers to the table you are joining to, which is Table 3.
When you combine a RIGHT JOIN with a WHERE clause that filters on Table 2's columns, you effectively negate the RIGHT JOIN and turn the query into an INNER JOIN. Why? The query first joins on the specified columns, then filters the entire result set by the WHERE clause. Table 3 rows without a match carry NULL in the Table 2 columns, so they fail the filter and are dropped.
If you want all rows from Table 2 to be shown regardless of whether they can be joined to Table 3, with the filter conditions applied as part of the join, you'll want to change your sub-query to:
SELECT T2.chargeitemid from Table2 as T2
LEFT JOIN Table3 as T3
ON T2.chargeitemid = T3.chargeitemid
AND T2.clubid = 1600
AND T2.membershiprecordid = 268765
AND T2.[Status] = 2
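This query returns all rows from Table 2 and only fills in columns from Table 3 when the chargeitemid matches and the conditions (T2.clubid = 1600 ...) are met. Note that conditions placed in the ON clause of a LEFT JOIN never remove Table 2 rows; to restrict the Table 2 rows themselves, move those conditions into a WHERE clause. This would accomplish "Table 2 taking priority over Table 3."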
In addition, does your sub-query (the query inside WHERE T1.ID NOT in) return the expected results? I can update this post as needed.
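To make the WHERE-versus-ON point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption: the dialects differ, but the outer-join semantics shown here are the same). Because older SQLite builds lack RIGHT JOIN, the sketch writes Table2 RIGHT JOIN Table3 as the equivalent Table3 LEFT JOIN Table2. The variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table2 (ID INT, membershiprecordid INT, ChargeItemID INT, Status INT);
INSERT INTO Table2 VALUES (1, 268765, 5060, 1), (2, 268765, 5060, 1);
CREATE TABLE Table3 (ID INT, ChargeItemID INT);
INSERT INTO Table3 VALUES (12146, 5058), (12146, 5060), (12146, 5062),
                          (12146, 5063), (12146, 5065);
""")

# Filter on T2 columns in WHERE: the NULL-extended Table3 rows fail the
# filter, so the outer join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT T3.ChargeItemID, T2.Status
    FROM Table3 AS T3
    LEFT JOIN Table2 AS T2 ON T2.ChargeItemID = T3.ChargeItemID
    WHERE T2.membershiprecordid = 268765 AND T2.Status = 2
""").fetchall()

# Same filter moved into ON: every Table3 row survives, and T2 columns
# are NULL wherever no Table2 row satisfies the conditions.
filter_in_on = conn.execute("""
    SELECT T3.ChargeItemID, T2.Status
    FROM Table3 AS T3
    LEFT JOIN Table2 AS T2
           ON T2.ChargeItemID = T3.ChargeItemID
          AND T2.membershiprecordid = 268765
          AND T2.Status = 2
    ORDER BY T3.ChargeItemID
""").fetchall()

print(filter_in_where)  # no rows: Table2 has no Status = 2 row
print(filter_in_on)     # all five Table3 rows, each with Status NULL
```

With the question's sample data the first query returns nothing, while the second keeps all five Table 3 rows.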

You can do that with a left join, which preserves all rows from Table1 and gives NULL for the columns of Table2 and Table3 that are not joined. This way you only need to filter on those fields being null.
select t1.id
from Table1 t1
left join
Table2 t2
on t1.ID = t2.ChargeItemID and
t2.membershiprecordid = 268765 and
t2.[Status] = 2
left join
Table3 t3
on t1.ID = t3.ChargeItemID
where t2.ChargeItemID is null or
t3.ChargeItemID is null
If, as per @BeanFrog's comment, you want the IDs that are in neither Table2 nor Table3, you can just replace OR with AND in the where clause.
Edit
It seems I misunderstood the requirements; this one should do the trick:
select t1.id
from Table1 t1
left join
Table3 t3
on t1.ID = t3.ChargeItemID
left join
Table2 t2
on t3.ChargeItemID = t2.ChargeItemID
where t3.ChargeItemID is null or (
coalesce(t2.membershiprecordid, -1) = 268765 and
coalesce(t2.[Status], -1) = 2
);
This will return the IDs that are not in t3, or that are in both t3 and t2 (with the t2 filters applied).

Try using NOT EXISTS, like this:
select T1.ID as cid
from Table1 as T1
where not exists (select 1 from Table2 as T2 where T2.chargeitemid = T1.ID)
and not exists
(select 1 from Table3 as T3 where T3.chargeitemid = T1.ID)
union all
select T2.chargeitemid as cid
from Table2 as T2
right join Table3 as T3 on T2.chargeitemid = T3.chargeitemid
where T2.membershiprecordid = 268765
and T2.Status = 2

I finally got this resolved by using SQL EXCEPT, as shown below. The query first reads all the IDs from the master table, then removes the IDs found in Table3 and the matching IDs from Table2 to get the final list.
SELECT id FROM table1
except
SELECT chargeitemid FROM table3
except
SELECT chargeitemid FROM table2 WHERE membershiprecordid = 268765 AND status = 2
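As a quick check of the EXCEPT approach against the sample data from the question, here is a small sketch that runs it in SQLite through Python's sqlite3 module (an assumption: SQLite stands in for SQL Server, and both evaluate chained EXCEPT left to right). It also runs a NOT EXISTS formulation, which is my own rewrite rather than part of the answer, to show that the two agree on this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INT, ChargeItem TEXT);
INSERT INTO Table1 VALUES
  (5055,'Item1'),(5056,'Item2'),(5057,'Item3'),(5058,'Item4'),
  (5059,'Item5'),(5060,'Item6'),(5061,'Item7'),(5062,'Item8'),
  (5063,'Item9'),(5064,'Item10'),(5065,'Item11');
CREATE TABLE Table2 (ID INT, membershiprecordid INT, ChargeItemID INT, Status INT);
INSERT INTO Table2 VALUES (1, 268765, 5060, 1), (2, 268765, 5060, 1);
CREATE TABLE Table3 (ID INT, ChargeItemID INT);
INSERT INTO Table3 VALUES (12146, 5058), (12146, 5060), (12146, 5062),
                          (12146, 5063), (12146, 5065);
""")

# The EXCEPT query from the answer, unchanged apart from identifier casing.
except_rows = conn.execute("""
    SELECT ID FROM Table1
    EXCEPT
    SELECT ChargeItemID FROM Table3
    EXCEPT
    SELECT ChargeItemID FROM Table2
    WHERE membershiprecordid = 268765 AND Status = 2
""").fetchall()
except_ids = sorted(row[0] for row in except_rows)

# Hypothetical NOT EXISTS rewrite of the same logic, for comparison.
not_exists_rows = conn.execute("""
    SELECT T1.ID FROM Table1 AS T1
    WHERE NOT EXISTS (SELECT 1 FROM Table3 AS T3 WHERE T3.ChargeItemID = T1.ID)
      AND NOT EXISTS (SELECT 1 FROM Table2 AS T2
                      WHERE T2.ChargeItemID = T1.ID
                        AND T2.membershiprecordid = 268765
                        AND T2.Status = 2)
""").fetchall()
not_exists_ids = sorted(row[0] for row in not_exists_rows)

print(except_ids)      # [5055, 5056, 5057, 5059, 5061, 5064]
print(not_exists_ids)  # same list
```

Both formulations return the six IDs listed as the expected result in the question.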

Related

SQL query returning all rows from Table2

I am trying to join 2 tables and return data if Table1.codeId is present in Table2 OR if Table1.codeId = 0. However, the query returns all rows from Table2.
Table1 {
name nvarchar,
codeId int
}
| name | codeId |
|--------|--------|
| Bob | 1 |
| John | 2 |
| Chris | 0 |
Table2 {
id int,
codeName nvarchar
}
| id | codeName |
|------|----------|
| 1 | Engineer |
| 2 | Doctor |
| 3 | Dentist |
| 4 | Pilot |
| 5 | Mechanic |
SELECT t1.name, t2.codeName
FROM dbo.Table1 t1, dbo.Table2 t2
WHERE (t1.codeId = t2.id OR t1.codeId = 0)
Expected result:
Bob, 1
John, 2
Chris, 0
You do not need a join at all for such a condition.
You can use a subquery as follows; it returns the result you expect:
select name,codeid from table1 where codeid in (select id from table2)
or codeid=0
What if you do it in two separate queries?
Looking at the outcome, the problem comes from the WHERE clause. For a row with codeId = 0, the OR branch is true for every row of Table2, so the comma join pairs that row with all of Table2.
So splitting could do it:
SELECT t1.name, t2.codeName
FROM dbo.Table1 t1, dbo.Table2 t2
WHERE (t1.codeId = t2.id)
SELECT t1.name, NULL AS codeName
FROM dbo.Table1 t1
WHERE (t1.codeId = 0)
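To see how the OR behaves in the question's comma join, here is a small sketch in SQLite via Python's sqlite3 module (an assumption: SQLite stands in for SQL Server, and the rows Bob/John/Chris are taken from the expected result). It counts how many joined rows each name produces and compares that with the subquery form.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (name TEXT, codeId INT);
INSERT INTO Table1 VALUES ('Bob', 1), ('John', 2), ('Chris', 0);
CREATE TABLE Table2 (id INT, codeName TEXT);
INSERT INTO Table2 VALUES (1, 'Engineer'), (2, 'Doctor'), (3, 'Dentist'),
                          (4, 'Pilot'), (5, 'Mechanic');
""")

# The query from the question: a comma (cross) join filtered by an OR.
cross_rows = conn.execute("""
    SELECT t1.name, t2.codeName
    FROM Table1 t1, Table2 t2
    WHERE (t1.codeId = t2.id OR t1.codeId = 0)
""").fetchall()

# Chris (codeId = 0) satisfies the OR for every Table2 row.
per_name = dict(Counter(name for name, _ in cross_rows))

# Subquery form: one output row per qualifying Table1 row.
subquery_rows = conn.execute("""
    SELECT name, codeId
    FROM Table1
    WHERE codeId IN (SELECT id FROM Table2) OR codeId = 0
    ORDER BY name
""").fetchall()

print(per_name)       # Bob and John once each, Chris five times
print(subquery_rows)  # one row each for Bob, Chris and John
```

The cross join yields seven rows in total, five of them for Chris, whereas the subquery form yields exactly three.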
You can use a left join. Use it to select the rows where there is a code match in Table2 or where codeId is 0.
create table Table1
(
name nvarchar(50),
codeId int
)
create table Table2
(
id int,
codeName nvarchar(50)
)
insert into Table1
VALUES
('Bob', 1),
('John', 2),
('Chris', 0),
('Tom', -1)
-- This should be excluded .. since -1 code doesn't exist in Table2
insert into Table2
VALUES
(1, 'Engineer'),
(2, 'Doctor'),
(3, 'Dentist'),
(4, 'Pilot'),
(5, 'Mechanic')
SELECT t1.name, t1.codeId
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t1.codeId = t2.id
WHERE t2.id is not NULL or t1.codeId = 0
You have to use a left outer join.
Please find the query below:
Select codeid,name
FROM Table1
LEFT OUTER JOIN Table2
ON Table1.codeId=Table2.id;

SQL - Finding optional value in another table

In the scenario where there are two tables, one column in the first has a nullable key to another table.
table1_id | table1_key | table2_id | table2_value
----------+------------+-----------+--------------
1 | 1 | 1 | 3
2 | | |
3 | 3 | 3 | 1
4 | 1 | 1 | 3
With a single efficient statement, I want to get all rows from table1 and data from table2 if they exist.
My current method does a union between two statements.
SELECT
table1.id as table1_id,
table1.fkey as table1_key,
table2.id as table2_id,
table2.value as table2_value
FROM
table1,
table2
WHERE
table1.fkey = table2.id
UNION
SELECT
table1.id as table1_id,
null,
null,
null
FROM
table1,
table2
WHERE
table1.fkey NOT IN (SELECT id FROM table2)
How can this be done more efficiently in a single select statement?
A left join would do the job:
SELECT
table1.id as table1_id,
table1.fkey as table1_key,
table2.id as table2_id,
table2.value as table2_value
FROM table1
LEFT OUTER JOIN table2
ON table1.fkey = table2.id
You need a join between table1 and table2 on the foreign key relationship.
From your question, I understand that column fkey in table1 is a foreign key to column id in table2.
You want to retrieve rows from table1 even if there is no matching row in table2. Hence you need a left outer join
select t1.id as t1_id
,t1.fkey as t2_id
,t2.value as t2_value
from table1 t1
left outer join table2 t2
on t1.fkey = t2.id
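A minimal sketch of the left join on the question's data, run in SQLite through Python's sqlite3 module (an assumption: the table definitions are guessed from the result grid in the question). It also runs a NOT IN filter like the one in the question's second UNION branch, to show that a NULL fkey never satisfies NOT IN, so that branch would miss row 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table2 (id INT, value INT);
INSERT INTO table2 VALUES (1, 3), (3, 1);
CREATE TABLE table1 (id INT, fkey INT);  -- fkey is a nullable key into table2
INSERT INTO table1 VALUES (1, 1), (2, NULL), (3, 3), (4, 1);
""")

# LEFT OUTER JOIN keeps every table1 row; unmatched rows get NULLs.
left_join_rows = conn.execute("""
    SELECT t1.id, t1.fkey, t2.id, t2.value
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.fkey = t2.id
    ORDER BY t1.id
""").fetchall()

# NULL NOT IN (...) evaluates to unknown, so the row with a NULL fkey
# is filtered out here instead of being returned.
not_in_rows = conn.execute("""
    SELECT id FROM table1
    WHERE fkey NOT IN (SELECT id FROM table2)
""").fetchall()

for row in left_join_rows:
    print(row)
print(not_in_rows)  # empty: row 2 is not found by NOT IN
```

The left join reproduces the four rows from the grid in the question, including row 2 with NULLs in the table2 columns.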

Find values where related must have list of values

I'm trying to find a simple solution for my SQL Server problem.
I have two tables that look like this:
table1
--id
-- data
table2
--id
--table1_id
--value
I have some records like this:
Table1
+-----------------------+
| id | data |
+-----------------------+
| 1 | ? |
+-----------------------+
| 2 | ? |
+-----------------------+
Table2
+-----------------------+
|id | table1_id | value |
+-----------------------+
| 1 | 1 | 'a' |
+-----------------------+
| 2 | 1 | 'b' |
+-----------------------+
| 3 | 2 | 'a' |
+-----------------------+
Now I want to get the table1 rows, with all their additional values, where the related table2 rows have both 'a' AND 'b' as values.
So I would get the id 1 of table1.
Currently I have a query like this:
SELECT t1.[id], t1.[data]
FROM [table1] t1,
(SELECT t1.[id]
FROM [table1] t1
JOIN [table2] t2 ON t1.[id] = t2.[table1_id] AND t2.[Value] IN('a', 'b')
GROUP BY t1.[id]
HAVING COUNT(t2.[Value]) = 2) x
WHERE t1.id = x.id
Has anyone an idea on how to achieve my goal in a simpler way?
One way uses exists:
select t1.*
from table1 t1
where exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'a'
) and
exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'b'
);
This can take advantage of an index on table2(table1_id, value).
You could also write:
select t1.*
from table1 t1
where (select count(distinct t2.value)
from table2 t2
where t2.table1_id = t1.id and t2.value in ('a', 'b')
) = 2 ;
This would probably also have very good performance with the index, if table2 doesn't have duplicates.
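The effect of duplicates on the counting approach can be sketched as follows, using SQLite through Python's sqlite3 module (an assumption: the extra row with id 4 is a hypothetical duplicate that is not in the question's data). A plain COUNT treats two 'a' rows as a match, while the EXISTS pair and COUNT(DISTINCT ...) do not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, data TEXT);
INSERT INTO table1 VALUES (1, '?'), (2, '?');
CREATE TABLE table2 (id INT, table1_id INT, value TEXT);
INSERT INTO table2 VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'a');
INSERT INTO table2 VALUES (4, 2, 'a');  -- hypothetical duplicate 'a' for id 2
""")

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Plain COUNT, as in the question's HAVING clause: fooled by the duplicate.
plain_count = ids("""
    SELECT t1.id FROM table1 t1
    JOIN table2 t2 ON t1.id = t2.table1_id AND t2.value IN ('a', 'b')
    GROUP BY t1.id HAVING COUNT(t2.value) = 2
    ORDER BY t1.id
""")

# Two EXISTS checks, one per required value.
with_exists = ids("""
    SELECT t1.id FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'a')
      AND EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'b')
    ORDER BY t1.id
""")

# COUNT(DISTINCT ...) ignores repeated values.
distinct_count = ids("""
    SELECT t1.id FROM table1 t1
    WHERE (SELECT COUNT(DISTINCT t2.value) FROM table2 t2
           WHERE t2.table1_id = t1.id AND t2.value IN ('a', 'b')) = 2
    ORDER BY t1.id
""")

print(plain_count)     # [1, 2]: id 2 is a false positive
print(with_exists)     # [1]
print(distinct_count)  # [1]
```

If table2 is guaranteed to hold no duplicate (table1_id, value) pairs, all three forms return the same ids.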
SELECT T1.[id], T1.[data]
FROM table1 AS T1
JOIN table2 AS T2
ON T1.[id]=T2.[table1_id]
JOIN table2 AS T3
ON T1.[id]=T3.[table1_id]
WHERE
T2.[Value] ='a'
AND T3.[Value] = 'b'
As Gordon Linoff suggested, using EXISTS works as well and could perform better, depending on your data.
You have to do several steps to solve the problem:
Establish which rows of table1 and table2 are related and have the value 'a' or 'b', and eliminate repeated rows with the GROUP BY (InfoRelationate).
Validate that only the ids related to both 'a' and 'b' are kept, by means of a count over the result above (ValidateAYB).
Join the validated ids back to table1 to get the rows that meet the condition.
This query meets the conditions:
with InfoRelationate as
(
select Table2.table1_id,value
from Table2 inner join
Table1 on Table2.table1_id=Table1.id and Table2.value IN('a', 'b')
group by Table2.table1_id,value
),
ValidateAYB as
(
select InfoRelationate.table1_id
from InfoRelationate
group by InfoRelationate.table1_id
having count (1)=2
)
select Table1.id,Table1.data
from Table1
inner join ValidateAYB on Table1.id=ValidateAYB.table1_id

Using the same table alias twice in a query

My coworker, who is new to ANSI join syntax, recently wrote a query like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
Note that table3 is joined to both table1 and table2 on different columns, but the two JOIN clauses use the same table alias for table3.
The query runs, but I'm unsure of its validity. Is this a valid way of writing this query?
I thought the join should be like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
Are the two versions functionally identical? I don't really have enough data in our database yet to be sure.
Thanks.
The first query is a join of 4 tables, the second one is a join of 3 tables. So I don't expect that both queries return the same numbers of rows.
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
The alias t3 is only used in the ON clause, where it refers to the table named immediately before the ON keyword. I found this out by experimenting. So the previous query is equivalent to
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t4 ON
(t4.col_c = t1.col_c);
and this can be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3,
table3 t4
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
and (t4.col_c = t1.col_c);
The second query is
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
This can also be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
AND (t3.col_c = t1.col_c);
These queries seem to be different. To prove their difference we use the following example:
create table table1(
col_a number,
col_c number
);
create table table2(
col_a number,
col_b number
);
create table table3(
col_b number,
col_c number
);
insert into table1(col_a, col_c) values(1,3);
insert into table1(col_a, col_c) values(4,3);
insert into table2(col_a, col_b) values(1,2);
insert into table2(col_a, col_b) values(4,2);
insert into table3(col_b, col_c) values(2,3);
insert into table3(col_b, col_c) values(2,5);
insert into table3(col_b, col_c) values(7,9);
commit;
We get the following output
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|-------|-------|
| 1 | 3 | 1 | 2 | 2 | 3 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 3 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 5 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 5 | 2 | 3 |
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|
| 4 | 3 | 4 | 2 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 3 |
The number of rows retrieved is different and so count(*) is different.
The usage of the aliases was surprising, at least for me.
The following query works because t1 in the where_clause references table2.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0;
The following query works because t1 in the where_clause references table1.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_c<0;
The following query raises an error because both table1 and table2 contain a column col_a.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_a<0;
The error thrown is
ORA-00918: column ambiguously defined
The following query works, the alias t1 refers to two different tables in the same where_clause.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0 and t1.col_c<0;
These and more examples can be found here: http://sqlfiddle.com/#!4/84feb/12
The smallest counter example
The smallest counter example is
table1
col_a col_c
1 2
table2
col_a col_b
1 3
table3
col_b col_c
3 5
6 2
Here the second query has an empty result set and the first query returns one row. It can be shown that the count(*) of the second query never exceeds the count(*) of the first query.
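The counterexample can be checked with a short sketch in SQLite via Python's sqlite3 module. This is an assumption-laden stand-in: the duplicate-alias resolution described above is Oracle behaviour, so the sketch uses the rewritten form with distinct aliases t3 and t4, which the answer argues is equivalent to the coworker's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col_a INT, col_c INT);
CREATE TABLE table2 (col_a INT, col_b INT);
CREATE TABLE table3 (col_b INT, col_c INT);
INSERT INTO table1 VALUES (1, 2);
INSERT INTO table2 VALUES (1, 3);
INSERT INTO table3 VALUES (3, 5), (6, 2);
""")

# First query, written with distinct aliases: table3 is joined twice,
# and each join may match a different table3 row.
four_table_count = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b
    JOIN table3 t4 ON t4.col_c = t1.col_c
""").fetchone()[0]

# Second query: a single table3 row must satisfy both conditions at once.
three_table_count = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b AND t3.col_c = t1.col_c
""").fetchone()[0]

print(four_table_count)   # 1
print(three_table_count)  # 0
```

Row (3, 5) of table3 satisfies the col_b join and row (6, 2) satisfies the col_c join, but no single row satisfies both, which is why the counts differ.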
A more detailed explanation
This behaviour will become clearer if we analyze the following statement in detail.
SELECT t.col_b, t.col_c
FROM table1 t
JOIN table2 t ON
(t.col_b = t.col_c) ;
Here is the reduced syntax for this query in Backus–Naur form, derived from the syntax descriptions in the SQL Language Reference of Oracle 12.2. Note that under each syntax diagram there is a link to the Backus–Naur form of that diagram, e.g. "Description of the illustration select.eps". "Reduced" means that I left out all the possibilities that were not used, e.g. the select is defined as
select::=subquery [ for_update_clause ] ;
Our query does not use the optional for_update_clause, so I reduced the rule to
select::=subquery
The only exception is the optional where_clause. I didn't remove it, so that these reduced rules can be used to analyze the above query even if we add a where_clause.
These reduced rules define only a subset of all possible select statements.
select::=subquery
subquery::=query_block
query_block::=SELECT select_list FROM join_clause [ where_clause ]
join_clause::=table_reference inner_cross_join_clause ...
table_reference::=query_table_expression t_alias
query_table_expression::=table
inner_cross_join_clause::=JOIN table_reference ON condition
So our select statement is a query_block and the join_clause is of type
table_reference inner_cross_join_clause
where table_reference is table1 t and inner_cross_join_clause is JOIN table2 t ON (t.col_b = t.col_c). The ellipsis ... means that there could be additional inner_cross_join_clauses, but we do not need this here.
In the inner_cross_join_clause the alias t refers to table2. Only if a reference cannot be satisfied there must the alias be searched for in an outer scope. So all the following expressions in the ON condition are valid:
t.col_b = t.col_c
Here t.col_b is table2.col_b because t refers to the alias of its inner_cross_join_clause, and t.col_c is table1.col_c. The t of the inner_cross_join_clause (referring to table2) has no column col_c, so the outer scope is searched and an appropriate alias is found.
If we have the clause
t.col_a = t.col_a
the alias can be found as alias defined in the inner_cross_join_clause to which this ON-condition belongs so t will be resolved to table2.
if the select list consists of
t.col_c, t.col_b, t.col_a
instead of * then the join_clause will be searched for an alias and t.col_c will be resolved to table1.col_c (table2 does not contain a column col_c), t.col_b will be resolved to table2.col_b (table1 does not contain a col_b) but t.col_a will raise the error
ORA-00918: column ambiguously defined
because for the select_list none of the alias definitions has precedence over the other. If our query also has a where_clause, then the aliases are resolved in the same way as if they were used in the select_list.
With more data, the two queries will produce different results.
Your colleague's query is roughly analogous to this:
select * from table3 t3 where t3.col_b = 'XX'
union
select * from table3 t3 where t3.col_c = 'YY'
or
select * from table3 t3 where t3.col_b = 'XX' or t3.col_c = 'YY'
while your query is like this:
select * from table3 t3 where t3.col_b = 'XX' and t3.col_c = 'YY'
The first one is like data where (xx or yy), while the second one is data where (xx and yy).

Update multiple rows using select statements

Let's say I have these tables and values:
Table1
------------------------
ID | Value
------------------------
2 | asdf
4 | fdsa
5 | aaaa
Table2
------------------------
ID | Value
------------------------
2 | bbbb
4 | bbbb
5 | bbbb
I want to update all the values in Table2 using the values in Table1 with their respective ID's.
I know I can run this:
UPDATE Table2
SET Value = t1.Value
FROM Table2 t2
INNER JOIN Table1 t1 on t1.ID = t2.ID
But what can I do if Table1 and Table2 are actually select statements with criteria? How can I modify the SQL statement to take that into consideration?
This is how such update queries are generally done in Oracle. Oracle doesn't have an UPDATE FROM option:
UPDATE table2 t2
SET t2.value = ( SELECT t1.value FROM table1 t1
WHERE t1.ID = t2.ID )
WHERE EXISTS ( SELECT 1 FROM table1 t1
WHERE t1.ID = t2.ID );
The WHERE EXISTS clause will make sure that only the rows with a corresponding row in table1 are updated (otherwise every row in table2 will be updated; those without corresponding rows in table1 will be updated to NULL).
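The role of the WHERE EXISTS guard can be sketched with SQLite through Python's sqlite3 module (assumptions: SQLite stands in for Oracle, the alias on the updated table is dropped because SQLite's SET clause takes unqualified column names, and the table2 row with ID 7 is a hypothetical row with no counterpart in table1).

```python
import sqlite3

def fresh_db():
    """Build an in-memory database with the question's data plus one unmatched row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE table1 (id INT, value TEXT);
    INSERT INTO table1 VALUES (2, 'asdf'), (4, 'fdsa'), (5, 'aaaa');
    CREATE TABLE table2 (id INT, value TEXT);
    INSERT INTO table2 VALUES (2, 'bbbb'), (4, 'bbbb'), (5, 'bbbb');
    INSERT INTO table2 VALUES (7, 'bbbb');  -- hypothetical: no match in table1
    """)
    return conn

UPDATE_SQL = """
    UPDATE table2
    SET value = (SELECT t1.value FROM table1 t1 WHERE t1.id = table2.id)
"""
EXISTS_GUARD = """
    WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.id = table2.id)
"""
READ_SQL = "SELECT id, value FROM table2 ORDER BY id"

# With the guard, only rows that have a match in table1 are touched.
guarded = fresh_db()
guarded.execute(UPDATE_SQL + EXISTS_GUARD)
with_exists = guarded.execute(READ_SQL).fetchall()

# Without the guard, the unmatched row is overwritten with NULL.
unguarded = fresh_db()
unguarded.execute(UPDATE_SQL)
without_exists = unguarded.execute(READ_SQL).fetchall()

print(with_exists)     # row 7 keeps 'bbbb'
print(without_exists)  # row 7 becomes None (NULL)
```

To apply criteria, as the question asks, the same pattern should work with table1 replaced by a filtered subquery in both the SET and the EXISTS parts.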