SQL: select all unique values in table A which are not in table B

I have table A
Id | Name | Department
-----------------------------
0 | Alice | 1
0 | Alice | 2
1 | Bob | 1
and table B
Id | Name
-------------
0 | Alice
I want to select all unique Ids in table A which do not exist in table B. How can I do this?

select distinct id
from TableA a
where not exists (
select id
from TableB
where id = a.id
)

Just to provide a different solution than NOT IN :
SELECT DISTINCT A.Id
FROM A
LEFT OUTER JOIN B
ON A.Id = B.Id
WHERE B.Id IS NULL
The "good" solution is usually MINUS or EXCEPT, but MySQL did not support either until EXCEPT was added in version 8.0.31.
This question was asked some time ago, and someone posted an article comparing NOT IN, NOT EXISTS and LEFT OUTER JOIN ... IS NULL. It would be interesting if someone could find it again!
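To compare the approaches discussed in this thread side by side, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs each form. The table names TableA and TableB and the choice of SQLite are assumptions made for illustration only; it checks that the queries agree, not which one is fastest on your engine.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Id INTEGER, Name TEXT, Department INTEGER);
    CREATE TABLE TableB (Id INTEGER, Name TEXT);
    INSERT INTO TableA VALUES (0, 'Alice', 1), (0, 'Alice', 2), (1, 'Bob', 1);
    INSERT INTO TableB VALUES (0, 'Alice');
""")

# Four ways to ask for the distinct Ids of TableA that are absent from TableB.
queries = {
    "not_exists": """
        SELECT DISTINCT a.Id FROM TableA a
        WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.Id = a.Id)
    """,
    "left_join": """
        SELECT DISTINCT a.Id FROM TableA a
        LEFT JOIN TableB b ON a.Id = b.Id
        WHERE b.Id IS NULL
    """,
    "not_in": """
        SELECT DISTINCT Id FROM TableA
        WHERE Id NOT IN (SELECT Id FROM TableB)
    """,
    "except": """
        SELECT Id FROM TableA EXCEPT SELECT Id FROM TableB
    """,
}

# Collect the Ids returned by each query.
results = {name: [row[0] for row in conn.execute(sql)] for name, sql in queries.items()}
for name, ids in results.items():
    print(name, ids)
```

On this sample data every variant returns only Bob's Id, because Alice's Id 0 is present in TableB.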

A left join is often the most efficient option, as using "NOT IN" can sometimes prevent a query from using an index, if one is present.
The answer in this case would be something like
SELECT DISTINCT
a.Id
FROM
TableA a
LEFT JOIN
TableB b
ON
a.Id = b.Id
WHERE
b.Id IS NULL
Alternatively, NOT EXISTS is more readable than a left join, and typically more efficient than the NOT IN solutions
SELECT DISTINCT a.Id FROM TableA a WHERE NOT EXISTS (SELECT * FROM TableB WHERE Id = a.Id)

I'd use a NOT EXISTS Like this:
SELECT A.Id
FROM TableA A
WHERE NOT EXISTS (SELECT B.Id FROM TableB B WHERE A.Id = B.Id)
GROUP BY A.Id

SELECT DISTINCT Id FROM A WHERE Id NOT IN (SELECT ID FROM B);

SELECT DISTINCT Id FROM A WHERE Id NOT IN(SELECT DISTINCT Id FROM B);

The subquery gets all the IDs in B. The GROUP BY then returns each ID in A that is not in B exactly once.
select id
from A
where id not in
(
select distinct id
from B
)
group by id;

Related

How can I show all information in a table and count how many times its shown up in another table?

I am working in SQLite3 where I am trying to use a SELECT statement to show the entire detail of a table and then count how many times it's appeared in another table.
For example: I have 2 tables, A_ID being a foreign key to A and ID being the primary key for Table A
Table A : ID | Name -> info (1,Sam), (2, Michael), (3,Gordon)
Table B : A_ID | Task -> info (1, T1), (1, T2), (2, T3), (3, T4)
OUTPUT: ID | NAME | COUNT() -> info (1 | Sam | 2), (2 | Michael | 1), (3 | Gordon | 1)
I had thought to try
SELECT *, COUNT(*)
FROM A
WHERE ID = (SELECT A_ID FROM B);
But this statement only showed me the first item and not the rest.
Sorry about the formatting, I'm not too familiar with using this yet. Thank you
You need to use a GROUP BY clause to get the values for each person in A, and LEFT JOIN table A to B on A_ID to get the counts of tasks for each person:
SELECT A.ID, A.Name, COUNT(B.Task) AS Tasks
FROM A
LEFT JOIN B ON B.A_ID = A.ID
GROUP BY A.ID, A.Name
Output (for your sample data):
ID Name Tasks
1 Sam 2
2 Michael 1
3 Gordon 1
You should use a GROUP BY clause with an aggregate function:
SELECT Name, COUNT(task)
FROM A left join B on A.ID=B.A_ID
GROUP BY Name;
You need a left join of TableA to TableB and group by id and name:
select a.id, a.name, count(b.a_id) counter
from TableA a left join TableB b
on b.a_id = a.id
group by a.id, a.name
For performance, I recommend a correlated subquery:
SELECT a.*, (SELECT COUNT(*) FROM B WHERE B.A_ID = A.ID)
FROM A;
This can take advantage of an index on B(A_ID) and avoids the outer aggregation.
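As a quick check that the LEFT JOIN with GROUP BY and the correlated subquery return the same counts, here is a small sketch using Python's sqlite3 module with the question's sample data. The fourth person ('Dana', with no tasks) and the index name idx_b_a_id are hypothetical additions, included only to show the zero-count case and the index mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE B (A_ID INTEGER REFERENCES A(ID), Task TEXT);
    INSERT INTO A VALUES (1, 'Sam'), (2, 'Michael'), (3, 'Gordon');
    INSERT INTO B VALUES (1, 'T1'), (1, 'T2'), (2, 'T3'), (3, 'T4');
    -- Hypothetical extra person (not in the question) with no tasks at all.
    INSERT INTO A VALUES (4, 'Dana');
    -- Index that the correlated subquery can use for its lookups.
    CREATE INDEX idx_b_a_id ON B (A_ID);
""")

# Variant 1: LEFT JOIN plus GROUP BY. COUNT(B.Task) ignores the NULLs
# produced for people without tasks, so they get a count of 0.
join_counts = conn.execute("""
    SELECT A.ID, A.Name, COUNT(B.Task) AS Tasks
    FROM A
    LEFT JOIN B ON B.A_ID = A.ID
    GROUP BY A.ID, A.Name
    ORDER BY A.ID
""").fetchall()

# Variant 2: correlated subquery, evaluated once per row of A.
subquery_counts = conn.execute("""
    SELECT A.ID, A.Name,
           (SELECT COUNT(*) FROM B WHERE B.A_ID = A.ID) AS Tasks
    FROM A
    ORDER BY A.ID
""").fetchall()

print(join_counts)
print(subquery_counts)
```

Both variants keep the person with no tasks and report a count of 0 for them. Which one is faster depends on the engine and the data, so it is worth measuring on your own tables.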

Query to output not existing data

Table A:
id Name
1 a
2 b
3 c
4 d
5 e
Table B:
id Name
3 c
4 d
5 e
Here, id is the primary key connected to Table B.
I need output like this:-
id
1
2
That means, which ids in Table A are not present in Table B
Use EXCEPT operator:
select id from tableA
except
select id from tableB
You can use a left join, which will preserve all records on the left side and associate them with null if no matching record is available on the right side.
This way you can then filter on the right side columns to be null to get the desired outcome
select t1.id
from tableA t1
left join
tableB t2
on t1.id = t2.id
where t2.id is null
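To make the left join explanation concrete, the sketch below (Python's sqlite3 module, chosen here only so the example is runnable) first shows the unfiltered join, where unmatched rows carry NULL on the right side, and then applies the IS NULL filter. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO tableA VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
    INSERT INTO tableB VALUES (3, 'c'), (4, 'd'), (5, 'e');
""")

# Step 1: the plain left join keeps every row of tableA; rows without a
# match in tableB get NULL (None in Python) in the right-hand column.
unfiltered = conn.execute("""
    SELECT t1.id, t2.id
    FROM tableA t1
    LEFT JOIN tableB t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

# Step 2: keeping only the rows where the right side is NULL leaves
# exactly the ids that are missing from tableB.
missing = [row[0] for row in conn.execute("""
    SELECT t1.id
    FROM tableA t1
    LEFT JOIN tableB t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
    ORDER BY t1.id
""")]

print(unfiltered)
print(missing)
```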
Use NOT EXISTS in WHERE clause
SELECT id FROM TableA A
WHERE NOT EXISTS(SELECT 1 FROM TableB B WHERE A.id = B.Id )
Using a NOT IN predicate:
Select id from TableA
where id not in (Select id from TableB);
In Oracle, you can use MINUS:
select id from tableA
minus
select id from tableB

SQL select newest data in 2 tables with the same primary key

I have two tables with the same structure:
A(id, ..., modified_date)
and B(id, ..., modified_date). For each id, I need to select the record with the later modified_date.
How can I write the SQL? Please help.
Example:
Table A
id | user name | email | modified date
------------------------------------------------
1 | Anne | ana#gmail.com | 2016/12/20
And table B
id | user name | email | modified date
------------------------------------------------
1 | Anne Jr, | ana_j#gmail.com | 2017/01/20
When two records have the same id, I need the one with the later modified_date. In the example above, for id = 1, I need to select the record with modified_date = 2017/01/20.
You can do a JOIN and then ORDER BY modified_date column like
select t1.id,t1.modified_date
from table1 t1 join table2 t2 on t1.id = t2.id
order by t1.modified_date desc;
If you need data from the B table, you can use:
SELECT b.*
FROM B b
JOIN A a ON b.id = a.id
WHERE b.modified_date > a.modified_date
Similarly, if you need data from the A table you can use:
SELECT a.*
FROM A a
JOIN B b ON a.id = b.id
WHERE a.modified_date > b.modified_date
In case there are multiple records which fit the criteria and you need only the one record which has the greatest modified date value, then you can use:
SELECT TOP 1 a.*
FROM A a
JOIN B b ON a.id = b.id
WHERE a.modified_date > b.modified_date
ORDER BY a.modified_date DESC
OR
SELECT TOP 1 b.*
FROM B b
JOIN A a ON b.id = a.id
WHERE b.modified_date > a.modified_date
ORDER BY b.modified_date DESC
Hope this helps!!!
You can try using a CASE expression on SQL Server
SELECT A.id,A.other_columns,
(CASE WHEN a.modified_date > b.modified_date THEN a.modified_date ELSE b.modified_date END) as modified_date
FROM [A] INNER JOIN [B] on A.id=B.id
If you want the higher of those two values, then use greatest() with a join:
select ta.id,
greatest(ta.modified_date, tb.modified_date)
from table_a ta
join table_b tb on ta.id = tb.id;
If you want all columns from the row with the later date, you can use a CASE expression per column:
select ta.id,
case
when ta.modified_date > tb.modified_date then ta.email
else tb.email
end as email,
case
when ta.modified_date > tb.modified_date then ta.user_name
else tb.user_name
end as user_name,
greatest(ta.modified_date, tb.modified_date) as modified_date
from table_a ta
join table_b tb on ta.id = tb.id;
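The CASE-per-column idea can be tried out with the sketch below, which uses Python's sqlite3 module and the single sample row from the question. Two things here are assumptions: the dates are stored as ISO text ('YYYY-MM-DD') so that plain string comparison matches chronological order, and the last column uses CASE rather than greatest(), since SQLite has no function of that name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, user_name TEXT,
                          email TEXT, modified_date TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, user_name TEXT,
                          email TEXT, modified_date TEXT);
    -- Sample rows from the question; dates rewritten as ISO text (assumption).
    INSERT INTO table_a VALUES (1, 'Anne', 'ana#gmail.com', '2016-12-20');
    INSERT INTO table_b VALUES (1, 'Anne Jr,', 'ana_j#gmail.com', '2017-01-20');
""")

# For each id present in both tables, take every column from whichever
# row has the later modified_date. Ties fall through to table_b.
newest = conn.execute("""
    SELECT ta.id,
           CASE WHEN ta.modified_date > tb.modified_date
                THEN ta.user_name ELSE tb.user_name END AS user_name,
           CASE WHEN ta.modified_date > tb.modified_date
                THEN ta.email ELSE tb.email END AS email,
           CASE WHEN ta.modified_date > tb.modified_date
                THEN ta.modified_date ELSE tb.modified_date END AS modified_date
    FROM table_a ta
    JOIN table_b tb ON ta.id = tb.id
""").fetchall()

print(newest)
```

Note that the inner join only returns ids that exist in both tables. If an id can be missing from one side, a different join (or a UNION of both tables) would be needed.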

SQL Select all from table A with counting table B

I need a query that grabs all columns from table A and also counts how many B rows each row in table A has.
Table A: id | username | email | address
Table B: user_id
SELECT *, total
FROM table_a
WHERE total = (SELECT * FROM table_b WHERE table_a.id==table_b.user_id)
Any ideas?
Edit: For more clarification here is the desired output
1 | steve | steve#steve.steve | 123 Steve | 5 // letters
2 | chris | chris#chris.chris | 123 chris | 2 // letters
SELECT
table_a.id,
table_a.username,
table_a.email,
table_a.address,
count(table_b.user_id) as total
FROM table_a
LEFT OUTER JOIN table_b
ON table_a.id = table_b.user_id
GROUP BY
table_a.id,
table_a.username,
table_a.email,
table_a.address
This is a good example of needing an outer join. If we used an inner join, the query would exclude the entries in table_a which have zero table_b entries.
This could be further refined to meet two challenges:
Include all of the columns of table_a without explicitly asking for them.
Handle the zero entry scenario without using non-standard SQL (e.g. ISNULL, WHERE).
The code below should do it.
SELECT
table_a.*,
tempTable.total
FROM (
SELECT
table_a.Id,
COUNT(table_b.user_id) as total
FROM table_a
LEFT OUTER JOIN table_b
ON table_a.id = table_b.user_id
GROUP BY (table_a.id)
) AS tempTable
INNER JOIN table_a
ON tempTable.Id = table_a.Id;
Comparing this with Cybernate's solution, non-standard SQL looks very attractive :-)
Try this:
SELECT a.*, ISNULL(bcnt, 0) bcnt
FROM TableA a LEFT JOIN
(
SELECT user_id, COUNT(1) AS BCNT
FROM TableB
GROUP BY user_id
) b
ON a.id = b.user_id
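ISNULL is specific to SQL Server; the standard COALESCE function does the same job here. The sketch below (Python's sqlite3 module, used only to make it runnable) pre-aggregates table_b and uses COALESCE for users with no rows, then runs the same query with an inner join to show which rows would be lost. The row counts follow the desired output in the question; the third user 'pat' is a hypothetical addition with no table_b rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, username TEXT,
                          email TEXT, address TEXT);
    CREATE TABLE table_b (user_id INTEGER);
    INSERT INTO table_a VALUES
        (1, 'steve', 'steve#steve.steve', '123 Steve'),
        (2, 'chris', 'chris#chris.chris', '123 chris'),
        (3, 'pat',   'pat#pat.pat',       '123 pat');
    -- steve has 5 rows and chris has 2; 'pat' (hypothetical) has none.
    INSERT INTO table_b VALUES (1), (1), (1), (1), (1), (2), (2);
""")

# Counts per user computed once in a derived table, then joined back.
counts_sql = "SELECT user_id, COUNT(*) AS bcnt FROM table_b GROUP BY user_id"

# LEFT JOIN keeps users without any table_b rows; COALESCE turns the
# resulting NULL count into 0.
with_zero = conn.execute(f"""
    SELECT a.id, a.username, COALESCE(b.bcnt, 0) AS total
    FROM table_a a
    LEFT JOIN ({counts_sql}) b ON a.id = b.user_id
    ORDER BY a.id
""").fetchall()

# The same query with an inner join silently drops those users.
inner_only = conn.execute(f"""
    SELECT a.id, a.username, b.bcnt AS total
    FROM table_a a
    JOIN ({counts_sql}) b ON a.id = b.user_id
    ORDER BY a.id
""").fetchall()

print(with_zero)
print(inner_only)
```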
You can just use a LEFT JOIN and aggregate functions:
SELECT b.user_id,
min(a.username) UserName,
min(a.email) Email,
min(a.address) Address,
COUNT(*) Quantity
FROM table_b b left join
table_a a on a.id=b.user_id
group by b.user_id
select
*,
(select count(*)
from #TableB as B
where A.id = B.user_id) as total
from #TableA as A

select a value where it doesn't exist in another table

I have two tables
Table A:
ID
1
2
3
4
Table B:
ID
1
2
3
I have two requests:
I want to select all rows in table A that table B doesn't have, which in this case is row 4.
I want to delete all rows that table B doesn't have.
I am using SQL Server 2000.
You could use NOT IN:
SELECT A.* FROM A WHERE ID NOT IN(SELECT ID FROM B)
However, I now prefer NOT EXISTS:
SELECT A.* FROM A WHERE NOT EXISTS(SELECT 1 FROM B WHERE B.ID=A.ID)
There are other options as well; this article explains the advantages and disadvantages of each very well:
Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?
For your first question there are at least three common methods to choose from:
NOT EXISTS
NOT IN
LEFT JOIN
The SQL looks like this:
SELECT * FROM TableA WHERE NOT EXISTS (
SELECT NULL
FROM TableB
WHERE TableB.ID = TableA.ID
)
SELECT * FROM TableA WHERE ID NOT IN (
SELECT ID FROM TableB
)
SELECT TableA.* FROM TableA
LEFT JOIN TableB
ON TableA.ID = TableB.ID
WHERE TableB.ID IS NULL
Depending on which database you are using, the performance of each can vary. For SQL Server (not nullable columns):
NOT EXISTS and NOT IN predicates are the best way to search for missing values, as long as both columns in question are NOT NULL.
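The NOT NULL condition matters because a single NULL in the subquery makes NOT IN return no rows at all, while NOT EXISTS is unaffected. A minimal sketch of that behaviour, using Python's sqlite3 module and the question's data plus one hypothetical NULL row in B:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO A VALUES (1), (2), (3), (4);
    -- The NULL row is a hypothetical addition to show the pitfall.
    INSERT INTO B VALUES (1), (2), (3), (NULL);
""")

# "4 NOT IN (1, 2, 3, NULL)" evaluates to NULL rather than true,
# so the row for ID 4 is filtered out.
not_in = [row[0] for row in conn.execute(
    "SELECT ID FROM A WHERE ID NOT IN (SELECT ID FROM B)")]

# NOT EXISTS only asks whether a matching row exists, so the NULL in B
# has no effect on the result.
not_exists = [row[0] for row in conn.execute(
    "SELECT ID FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID)")]

print(not_in)
print(not_exists)
```

If B.ID is nullable and you still want NOT IN, adding WHERE ID IS NOT NULL inside the subquery avoids the problem.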
select ID from A where ID not in (select ID from B);
or, on SQL Server 2005 and later (EXCEPT is not available in SQL Server 2000):
select ID from A except select ID from B;
Your second question:
delete from A where ID not in (select ID from B);
SELECT ID
FROM A
WHERE NOT EXISTS( SELECT 1
FROM B
WHERE B.ID = A.ID
)
This would select 4 in your case
SELECT ID FROM TableA WHERE ID NOT IN (SELECT ID FROM TableB)
This would delete them
DELETE FROM TableA WHERE ID NOT IN (SELECT ID FROM TableB)
SELECT ID
FROM A
WHERE ID NOT IN (
SELECT ID
FROM B);
SELECT ID
FROM A a
WHERE NOT EXISTS (
SELECT 1
FROM B b
WHERE b.ID = a.ID)
SELECT a.ID
FROM A a
LEFT OUTER JOIN B b
ON a.ID = b.ID
WHERE b.ID IS NULL
DELETE
FROM A
WHERE ID NOT IN (
SELECT ID
FROM B)
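Before running a DELETE like the ones above, it can help to preview the affected rows with the matching SELECT. The sketch below does both against the question's data using Python's sqlite3 module; SQLite is an assumption made for the sake of a runnable example, and the NOT EXISTS form is used for the delete so a NULL in B could not change the outcome.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY);
    CREATE TABLE B (ID INTEGER PRIMARY KEY);
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES (1), (2), (3);
""")

# Shared predicate: rows of A that have no matching row in B.
predicate = "NOT EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID)"

# Preview which rows the DELETE would remove.
to_delete = [row[0] for row in conn.execute(
    f"SELECT ID FROM A WHERE {predicate}")]

# Run the delete with the same predicate and record how many rows it removed.
deleted = conn.execute(f"DELETE FROM A WHERE {predicate}").rowcount
conn.commit()

# What is left in A afterwards.
remaining = [row[0] for row in conn.execute("SELECT ID FROM A ORDER BY ID")]

print(to_delete, deleted, remaining)
```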