SQL - Select not repeated rows from 2 tables? - sql

I have 2 tables (perhaps they are badly built).
table1
id | word | user
1 | a | me
2 | b | dad
3 | c | mom
4 | d | sister
table2
id | word | user
1 | a | me
2 | b | dad
I want to show all rows from table1, excluding the rows that also appear in table2. In this case, the select must display rows 3 and 4 from table1.
Thanks.

Try this
Select * from Table1
Except
Select * from Table2

You did not specify which RDBMS you are using, but NOT EXISTS works in all databases:
select *
from table1 t1
where not exists (select *
from table2 t2
where t1.word = t2.word
and t1.user = t2.user
-- add other columns here for comparison, including id
)

Like so:
SELECT *
FROM Table1
WHERE id NOT IN(SELECT id FROM Table2);
Or: using a LEFT JOIN like so:
SELECT t1.*
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.id IS NULL;

You can use EXCEPT (SQL-Server >= 2005)
SELECT id, word, user
FROM Table1
EXCEPT
SELECT id, word, user
FROM Table2;

As you did not specify which flavour of SQL you are using, it is probably wise to steer clear of EXCEPT, which not every RDBMS supports, and use a more widely supported construct. So this is a case for using a left outer join.
SELECT t1.*
FROM table1 AS t1
LEFT OUTER JOIN table2 AS t2
ON t1.word = t2.word
AND t1.user = t2.user
WHERE t2.id IS NULL
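
The answers above can be compared side by side. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine (the question does not name an RDBMS, so behaviour elsewhere may differ); it loads the sample rows from the question and runs the NOT EXISTS, LEFT JOIN and EXCEPT variants.

```python
import sqlite3

# In-memory database populated with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, word TEXT, user TEXT);
    CREATE TABLE table2 (id INTEGER, word TEXT, user TEXT);
    INSERT INTO table1 VALUES (1,'a','me'), (2,'b','dad'), (3,'c','mom'), (4,'d','sister');
    INSERT INTO table2 VALUES (1,'a','me'), (2,'b','dad');
""")

# Three ways of writing the same anti-join.
queries = {
    "not_exists": """
        SELECT t1.id, t1.word, t1.user FROM table1 t1
        WHERE NOT EXISTS (SELECT 1 FROM table2 t2
                          WHERE t1.word = t2.word AND t1.user = t2.user)
        ORDER BY t1.id""",
    "left_join": """
        SELECT t1.id, t1.word, t1.user FROM table1 t1
        LEFT JOIN table2 t2 ON t1.word = t2.word AND t1.user = t2.user
        WHERE t2.id IS NULL
        ORDER BY t1.id""",
    "except": """
        SELECT id, word, user FROM table1
        EXCEPT
        SELECT id, word, user FROM table2
        ORDER BY id""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

On this data all three variants return rows 3 and 4. Note that EXCEPT compares whole rows and removes duplicates, while the join-based variants compare only the columns listed in the join condition.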

Related

Compare columns from 2 different tables with only last inserted values in table_2 in SQL Server

If I have two different tables in a SQL Server 2019 database as follows:
Table1
|id | name |
+-----+--------+
| 1 | rose |
| 2 | peter |
| 3 | ann |
| 4 | rose |
| 5 | ann |
Table2
| name2 |
+--------+
|rose |
|ann |
I would like to retrieve only the last two ids from table1 (which in this case are 4 and 5) that match name2 in table2. In other words, each name should match only once, on the most recently added row in table1; furthermore, the ids (4, 5) are to be inserted into table2.
How to do that using SQL?
Thank you
You can use row_number()
select name,id from
(
select *, row_number() over(partition by t.name order by id desc) as rn
from table1 t join table2 t1 on t.name=t1.name2
)A where rn=1
Your question is vague, so there could be many answers here. My first thought is that you simply want an inner join. This will fetch ONLY the data that both tables share.
SELECT Table1.*
FROM Table1
INNER JOIN Table2 on Table1.name = Table2.name2
You seem to be describing:
select . . . -- whatever columns you want
from (select top (2) t1.*
from table1 t1
order by t1.id desc
) t1 join
table2 t2
on t2.name2 = t1.name;
This doesn't seem particularly useful for the data you have provided, but it does what you describe.
EDIT:
If you want only the most recent rows that match, use row_number():
select . . . -- whatever columns you want
from (select t1.*,
row_number() over (partition by name order by id desc) as seqnum
from table1 t1
) t1 join
table2 t2
on t2.name2 = t1.name and t1.seqnum = 1;
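
To see what the row_number() version returns for the sample data, here is a small sketch that runs it through Python's sqlite3 module. This is an assumption made for illustration: the question targets SQL Server 2019, and SQLite needs version 3.25 or later for window functions.

```python
import sqlite3

# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (name2 TEXT);
    INSERT INTO table1 VALUES (1,'rose'), (2,'peter'), (3,'ann'), (4,'rose'), (5,'ann');
    INSERT INTO table2 VALUES ('rose'), ('ann');
""")

# Number the rows of each name from newest (highest id) to oldest,
# then keep only the newest row of every name that also appears in table2.
latest = conn.execute("""
    SELECT t1.id, t1.name
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS seqnum
          FROM table1 t) t1
    JOIN table2 t2 ON t2.name2 = t1.name AND t1.seqnum = 1
    ORDER BY t1.id
""").fetchall()

print(latest)
```

The query keeps ids 4 and 5; 'peter' is dropped because it has no match in table2.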

sql synthesis query

Good morning.
Can anyone help me with a query that summarizes the total count of each code in each table (table 1 and table 2), as shown in table 3? Thanks a lot.
Sorry, my English is not good.
SELECT A.code, count(A.Date) as bang_1 , B.bang_2
FROM table_1 A
LEFT JOIN (SELECT code,count(*) as bang_2
FROM table_2
GROUP BY code) B ON A.code = B.code
GROUP BY A.code, B.bang_2
I think I understand what you are trying to do:
select codes.codename, t1.c as Table1, t2.c as Table2
from (
select codename from table1
union select codename from table2
) codes
left join (select codename, count(*) c from table1 group by codename) t1 on codes.codename = t1.codename
left join (select codename, count(*) c from table2 group by codename) t2 on codes.codename = t2.codename
group by codes.codename
order by codes.codename;
Example: https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=74d3bf30af406a2cdf65185a5fdcc564
Explanation
There are 2 subqueries, t1 and t2. Each of them counts the rows per code in one of the tables. The codes subquery combines the codenames from both tables, in case one table has codes that the other lacks.
Then the combined codenames are left-joined to t1 and t2 to collect their respective counts.
Result
codename | Table1 | Table2
:------- | -----: | -----:
code1 | 2 | 1
code2 | 2 | 3
code3 | 1 | 1
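
A runnable sketch of the same idea, using Python's sqlite3 module. The rows below are hypothetical (the question's data is not shown in the thread); they were chosen only so that the counts match the result table above. The trailing GROUP BY is omitted here because the UNION already yields one row per codename.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (codename TEXT);
    CREATE TABLE table2 (codename TEXT);
    -- Hypothetical rows, chosen to reproduce the counts in the result table.
    INSERT INTO table1 VALUES ('code1'), ('code1'), ('code2'), ('code2'), ('code3');
    INSERT INTO table2 VALUES ('code1'), ('code2'), ('code2'), ('code2'), ('code3');
""")

# codes: every codename present in either table (UNION removes duplicates).
# t1 / t2: per-table counts, left-joined so a code missing from one table
# still appears (with NULL as its count for that table).
summary = conn.execute("""
    SELECT codes.codename, t1.c AS table1_count, t2.c AS table2_count
    FROM (SELECT codename FROM table1
          UNION
          SELECT codename FROM table2) codes
    LEFT JOIN (SELECT codename, COUNT(*) AS c FROM table1 GROUP BY codename) t1
           ON codes.codename = t1.codename
    LEFT JOIN (SELECT codename, COUNT(*) AS c FROM table2 GROUP BY codename) t2
           ON codes.codename = t2.codename
    ORDER BY codes.codename
""").fetchall()

for row in summary:
    print(row)
```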

Find values where related must have list of values

I'm trying to find a simple solution for my SQL Server problem.
I have two tables that look like this:
table1
--id
-- data
table2
--id
--table1_id
--value
I have some records like this:
Table1
+-----------------------+
| id | data |
+-----------------------+
| 1 | ? |
+-----------------------+
| 2 | ? |
+-----------------------+
Table2
+-----------------------+
|id | table1_id | value |
+-----------------------+
| 1 | 1 | 'a' |
+-----------------------+
| 2 | 1 | 'b' |
+-----------------------+
| 3 | 2 | 'a' |
+-----------------------+
Now I want to get the rows of table1, with all their additional values, for which the related rows in table2 include both 'a' AND 'b' as values.
So I would get the id 1 of table1.
Currently I have a query like this:
SELECT t1.[id], t1.[data]
FROM [table1] t1,
(SELECT t1.[id]
FROM [table1] t1
JOIN [table2] t2 ON t1.[id] = t2.[table1_id] AND t2.[Value] IN('a', 'b')
GROUP BY t1.[id]
HAVING COUNT(t2.[Value]) = 2) x
WHERE t1.id = x.id
Does anyone have an idea on how to achieve my goal in a simpler way?
One way uses exists:
select t1.*
from table1 t1
where exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'a'
) and
exists (select 1
from table2 t2
where t2.table1_id = t1.id and t2.value = 'b'
);
This can take advantage of an index on table2(table1_id, value).
You could also write:
select t1.*
from table1 t1
where (select count(distinct t2.value)
from table2 t2
where t2.table1_id = t1.id and t2.value in ('a', 'b')
) = 2 ;
This would probably also have very good performance with the index, if table2 doesn't have duplicates.
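
Both variants can be checked against the sample rows from the question. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained).

```python
import sqlite3

# Sample rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, data TEXT);
    CREATE TABLE table2 (id INTEGER, table1_id INTEGER, value TEXT);
    INSERT INTO table1 VALUES (1, '?'), (2, '?');
    INSERT INTO table2 VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'a');
""")

# Variant 1: one EXISTS per required value.
with_exists = conn.execute("""
    SELECT t1.id, t1.data
    FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'a')
      AND EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.table1_id = t1.id AND t2.value = 'b')
""").fetchall()

# Variant 2: count the distinct required values found for each table1 row.
with_count = conn.execute("""
    SELECT t1.id, t1.data
    FROM table1 t1
    WHERE (SELECT COUNT(DISTINCT t2.value)
           FROM table2 t2
           WHERE t2.table1_id = t1.id AND t2.value IN ('a', 'b')) = 2
""").fetchall()

print(with_exists, with_count)
```

Only id 1 has both 'a' and 'b', so both queries return that single row.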
SELECT T1.[id], T1.[data]
FROM table1 AS T1
JOIN table2 AS T2
ON T1.[id]=T2.[table1_id]
JOIN table2 AS T3
ON T1.[id]=T3.[table1_id]
WHERE
T2.[Value] ='a'
AND T3.[Value] = 'b'
As Gordon Linoff suggested, an EXISTS clause works as well and could perform better, depending on the data you are working with.
You have to do several steps to solve the problem:
Establish which records in table2 are related to table1 and have the value 'a' or 'b', and eliminate the repeated ones with a GROUP BY (InfoRelationate).
Validate that only the ids related to both 'a' and 'b' are kept, by means of a count over the result above (ValidateAYB).
See which data meets the condition in table1 and table2, and join it with table1.
This query meets the conditions:
with InfoRelationate as
(
select Table2.table1_id,value
from Table2 inner join
Table1 on Table2.table1_id=Table1.id and Table2.value IN('a', 'b')
group by Table2.table1_id,value
),
ValidateAYB as
(
select InfoRelationate.table1_id
from InfoRelationate
group by InfoRelationate.table1_id
having count (1)=2
)
select InfoRelationate.table1_id,InfoRelationate.value
from InfoRelationate
inner join ValidateAYB on InfoRelationate.table1_id=ValidateAYB.table1_id
union all
select id,data
from Table1

Using the same table alias twice in a query

My coworker, who is new to ANSI join syntax, recently wrote a query like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
Note that table3 is joined to both table1 and table2 on different columns, but the two JOIN clauses use the same table alias for table3.
The query runs, but I'm unsure of its validity. Is this a valid way of writing this query?
I thought the join should be like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
Are the two versions functionally identical? I don't really have enough data in our database yet to be sure.
Thanks.
The first query is a join of 4 tables, the second one is a join of 3 tables. So I don't expect that both queries return the same numbers of rows.
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
The alias t3 is only used in the ON clause. The alias t3 refers to the table before the ON keyword. I found this out by experimenting. So the previous query is equivalent to
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t4 ON
(t4.col_c = t1.col_c);
and this can be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3,
table3 t4
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
and (t4.col_c = t1.col_c);
The second query is
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
This can also be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
AND (t3.col_c = t1.col_c);
These queries seem to be different. To prove that they differ, we use the following example:
create table table1(
col_a number,
col_c number
);
create table table2(
col_a number,
col_b number
);
create table table3(
col_b number,
col_c number
);
insert into table1(col_a, col_c) values(1,3);
insert into table1(col_a, col_c) values(4,3);
insert into table2(col_a, col_b) values(1,2);
insert into table2(col_a, col_b) values(4,2);
insert into table3(col_b, col_c) values(2,3);
insert into table3(col_b, col_c) values(2,5);
insert into table3(col_b, col_c) values(7,9);
commit;
We get the following output
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|-------|-------|
| 1 | 3 | 1 | 2 | 2 | 3 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 3 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 5 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 5 | 2 | 3 |
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|
| 4 | 3 | 4 | 2 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 3 |
The number of rows retrieved is different and so count(*) is different.
The usage of the aliases was surprising, at least for me.
The following query works because t1 in the where_clause references table2.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0;
The following query works because t1 in the where_clause references table1.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_c<0;
The following query raises an error because both table1 and table2 contain a column col_a.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_a<0;
The error thrown is
ORA-00918: column ambiguously defined
The following query works, the alias t1 refers to two different tables in the same where_clause.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0 and t1.col_c<0;
These and more examples can be found here: http://sqlfiddle.com/#!4/84feb/12
The smallest counter example
The smallest counter example is
table1
col_a col_c
1 2
table2
col_a col_b
1 3
table3
col_b col_c
3 5
6 2
Here the second query has an empty result set and the first query returns one row. It can be shown that the count(*) of the second query never exceeds the count(*) of the first query.
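
The counterexample can be reproduced with a short sketch. It uses Python's sqlite3 module instead of Oracle (an assumption for the sake of a self-contained example) and writes the first query in its equivalent form with the distinct aliases t3 and t4, since the duplicate-alias behaviour described above was observed on Oracle and may not carry over to other engines.

```python
import sqlite3

# The smallest counterexample from the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col_a INTEGER, col_c INTEGER);
    CREATE TABLE table2 (col_a INTEGER, col_b INTEGER);
    CREATE TABLE table3 (col_b INTEGER, col_c INTEGER);
    INSERT INTO table1 VALUES (1, 2);
    INSERT INTO table2 VALUES (1, 3);
    INSERT INTO table3 VALUES (3, 5), (6, 2);
""")

# First query: table3 is joined twice, so each condition may be
# satisfied by a different table3 row.
four_way = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b
    JOIN table3 t4 ON t4.col_c = t1.col_c
""").fetchone()[0]

# Second query: table3 is joined once, so a single row must
# satisfy both conditions.
three_way = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b AND t3.col_c = t1.col_c
""").fetchone()[0]

print(four_way, three_way)
```

The four-table join finds one row (table3 row (3, 5) satisfies the first condition and row (6, 2) the second), while the three-table join finds none because no single table3 row satisfies both.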
A more detailed explanation
This behaviour will become clearer if we analyze the following statement in detail.
SELECT t.col_b, t.col_c
FROM table1 t
JOIN table2 t ON
(t.col_b = t.col_c) ;
Here is the reduced syntax for this query in Backus–Naur form, derived from the syntax descriptions in the SQL Language Reference of Oracle 12.2. Note that under each syntax diagram there is a link to the Backus–Naur form of that diagram, e.g. Description of the illustration select.eps. "Reduced" means that I left out all the possibilities that were not used; e.g. the select is defined as
select::=subquery [ for_update_clause ] ;
Our query does not use the optional for_update_clause, so I reduced the rule to
select::=subquery
The only exception is the optional where_clause. I didn't remove it, so that these reduced rules can be used to analyze the above query even if we add a where_clause.
These reduced rules define only a subset of all possible select statements.
select::=subquery
subquery::=query_block
query_block::=SELECT select_list FROM join_clause [ where_clause ]
join_clause::=table_reference inner_cross_join_clause ...
table_reference::=query_table_expression t_alias
query_table_expression::=table
inner_cross_join_clause::=JOIN table_reference ON condition
So our select statement is a query_block and the join_clause is of type
table_reference inner_cross_join_clause
where table_reference is table1 t and inner_cross_join_clause is JOIN table2 t ON (t.col_b = t.col_c). The ellipsis ... means that there could be additional inner_cross_join_clauses, but we do not need this here.
In the inner_cross_join_clause the alias t refers to table2. Only if these references cannot be satisfied must the alias be searched for in an outer scope. So all the following expressions in the ON condition are valid:
t.col_b = t.col_c
Here t.col_b is table2.col_b, because t refers to the alias of its inner_cross_join_clause, and t.col_c is table1.col_c. The t of the inner_cross_join_clause (referring to table2) has no column col_c, so the outer scope is searched and an appropriate alias is found.
If we have the clause
t.col_a = t.col_a
the alias can be found as alias defined in the inner_cross_join_clause to which this ON-condition belongs so t will be resolved to table2.
If the select list consists of
t.col_c, t.col_b, t.col_a
instead of * then the join_clause will be searched for an alias and t.col_c will be resolved to table1.col_c (table2 does not contain a column col_c), t.col_b will be resolved to table2.col_b (table1 does not contain a col_b) but t.col_a will raise the error
ORA-00918: column ambiguously defined
because for the select_list none of the alias definitions has precedence over the other. If our query also has a where_clause, then the aliases are resolved in the same way as if they were used in the select_list.
With more data, it will produce different results.
Your colleague's query is the same as this:
select * from table3 t3 where t3.col_b = 'XX'
union
select * from table3 t3 where t3.col_c = 'YY'
or
select * from table3 t3 where t3.col_b = 'XX' or t3.col_c = 'YY'
while your query is like this:
select * from table3 t3 where t3.col_b = 'XX' and t3.col_c = 'YY'
First one is like data where (xx or yy) while second one is data where ( xx and yy)

SQL how to simulate an xor?

I'm wondering if anybody can help me solve this question I got at a job interview. Let's say I have two tables like:
table1 table2
------------ -------------
id | name id | name
------------ -------------
1 | alpha 1 | alpha
3 | charlie 3 | charlie
4 | delta 5 | echo
8 | hotel 7 | golf
9 | india
The question was to write a SQL query that would return all the rows that are in either table1 or table2 but not both, i.e.:
result
------------
id | name
------------
4 | delta
5 | echo
7 | golf
8 | hotel
9 | india
I thought I could do something like a full outer join:
SELECT table1.*, table2.*
FROM table1 FULL OUTER JOIN table2
ON table1.id=table2.id
WHERE table1.id IS NULL or table2.id IS NULL
but that gives me a syntax error on SQL Fiddle (I don't think it supports the FULL OUTER JOIN syntax). Other than that, I can't even figure out a way to just concatenate the rows of the two tables, let alone filtering out rows that appear in both. Can somebody enlighten me and tell me how to do this? Thanks.
Well, you could use UNION instead of OUTER JOIN.
SELECT * FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
UNION
SELECT * FROM table1 t1
RIGHT JOIN table2 t2 ON t1.id = t2.id
Here's a little trick I know: not equals is the same as XOR, so you could have your WHERE clause something like this:
WHERE ( table1.id IS NULL ) != ( table2.id IS NULL )
select id,name--,COUNT(*)
from(
select id,name from table1
union all
select id,name from table2
) x
group by id,name
having COUNT(*)=1
I'm sure there are lots of solutions, but the first thing that comes to mind for me is to union all the two tables, then group by id and name, and filter with a having clause on the count.
(
SELECT * FROM TABLE1
EXCEPT
SELECT * FROM TABLE2
)
UNION ALL
(
SELECT * FROM TABLE2
EXCEPT
SELECT * FROM TABLE1
)
This should work on most database servers
SELECT id, name
FROM table1
WHERE NOT EXISTS(SELECT NULL FROM table2 WHERE table1.id = table2.id AND table1.name = table2.name)
UNION ALL
SELECT id, name
FROM table2
WHERE NOT EXISTS(SELECT NULL FROM table1 WHERE table1.id = table2.id AND table1.name = table2.name)
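
Two of the approaches above can be tried on the sample data from the question. This is a sketch using Python's sqlite3 module (an assumption; the thread does not settle on an engine). The GROUP BY variant additionally assumes that neither table contains the same (id, name) row twice, otherwise a row unique to one table would be counted more than once and filtered out.

```python
import sqlite3

# Sample rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1,'alpha'), (3,'charlie'), (4,'delta'), (8,'hotel');
    INSERT INTO table2 VALUES (1,'alpha'), (3,'charlie'), (5,'echo'), (7,'golf'), (9,'india');
""")

# Approach 1: stack both tables and keep the rows that occur exactly once.
by_group = conn.execute("""
    SELECT id, name
    FROM (SELECT id, name FROM table1
          UNION ALL
          SELECT id, name FROM table2) x
    GROUP BY id, name
    HAVING COUNT(*) = 1
    ORDER BY id
""").fetchall()

# Approach 2: rows of each table that have no counterpart in the other.
by_not_exists = conn.execute("""
    SELECT id, name FROM table1
    WHERE NOT EXISTS (SELECT NULL FROM table2
                      WHERE table1.id = table2.id AND table1.name = table2.name)
    UNION ALL
    SELECT id, name FROM table2
    WHERE NOT EXISTS (SELECT NULL FROM table1
                      WHERE table1.id = table2.id AND table1.name = table2.name)
    ORDER BY id
""").fetchall()

print(by_group)
print(by_not_exists)
```

Both queries return the five rows listed in the expected result of the question: delta, echo, golf, hotel and india.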