SQL Self-join, Not Exists, Or Something Else?

many-to-many name and role table --
create table t (name varchar, role varchar) ;
insert into t (name, role) values ('joe', 'husband'), ('joe', 'father'),
('tom', 'husband'), ('neo', 'bachelor') ;
> select * from t;
name | role
------+----------
joe | husband
joe | father
tom | husband
neo | bachelor
need to convert into mapping of name and the role(s) he does not have --
not_a | name
---------+-----------
husband | neo
father | tom
father | neo
bachelor | joe
bachelor | tom
How to achieve that in true SQL without iterating through each role/name?

To get roles that someone doesn't have is a little complicated. You have to generate all pairs of names and roles and then pick out the ones that don't exist. This uses a left outer join.
The following is standard SQL for doing this:
select r.role as not_a, n.name
from (select distinct name from t) n cross join
(select distinct role from t) r left outer join
t
on t.name = n.name and t.role = r.role
where t.name is null;
As a note: never use varchar() without a length when defining variables and columns. The default values may not do what you expect.

Assuming you only have this table you can use:
SELECT r.role AS not_a, n.Name
FROM (SELECT DISTINCT Name FROM T) AS n
CROSS JOIN (SELECT DISTINCT Role FROM T) AS r
WHERE NOT EXISTS
( SELECT 1
FROM t
WHERE t.Name = n.Name
AND t.Role = r.Role
);
The main query generates all pairs of names and roles; the NOT EXISTS then excludes every pair that already exists in the table.
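The pair-generation idea described above can be checked with a small sketch. It assumes SQLite (through Python's sqlite3 module) purely as a convenient test bed, and the VARCHAR(20) lengths are an arbitrary choice for the demo. It runs the NOT EXISTS form and the LEFT JOIN / IS NULL form against the sample data so the two results can be compared.

```python
import sqlite3

# In-memory SQLite database, used here only as a test bed (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (name VARCHAR(20), role VARCHAR(20));
    INSERT INTO t (name, role) VALUES
        ('joe', 'husband'), ('joe', 'father'),
        ('tom', 'husband'), ('neo', 'bachelor');
""")

# All (name, role) pairs, minus the pairs already present in t.
not_exists_sql = """
    SELECT r.role AS not_a, n.name
    FROM (SELECT DISTINCT name FROM t) AS n
    CROSS JOIN (SELECT DISTINCT role FROM t) AS r
    WHERE NOT EXISTS (SELECT 1 FROM t
                      WHERE t.name = n.name AND t.role = r.role)
"""

# Same anti-join written as LEFT JOIN ... IS NULL.
left_join_sql = """
    SELECT r.role AS not_a, n.name
    FROM (SELECT DISTINCT name FROM t) AS n
    CROSS JOIN (SELECT DISTINCT role FROM t) AS r
    LEFT JOIN t ON t.name = n.name AND t.role = r.role
    WHERE t.name IS NULL
"""

missing_via_not_exists = sorted(conn.execute(not_exists_sql).fetchall())
missing_via_left_join = sorted(conn.execute(left_join_sql).fetchall())

for not_a, name in missing_via_not_exists:
    print(not_a, "|", name)
```

Both forms should list the same five pairs as the desired output in the question, just in a different order.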
If you actually have a name and role table, then you can replace the subqueries with the actual tables:
SELECT r.role AS not_a, n.Name
FROM Names AS n
CROSS JOIN Roles AS r
WHERE NOT EXISTS
( SELECT 1
FROM t
WHERE t.Name = n.Name
AND t.Role = r.Role
);
You haven't specified a DBMS, so note that if you are using MySQL, LEFT JOIN/IS NULL can perform better than NOT EXISTS:
SELECT r.role AS not_a, n.Name
FROM (SELECT DISTINCT Name FROM T) AS n
CROSS JOIN (SELECT DISTINCT Role FROM T) AS r
LEFT JOIN t
ON t.Name = n.Name
AND t.Role = r.Role
WHERE t.Name IS NULL;
I am also assuming it was just for the demo, but in your table DDL you have used VARCHAR without a length, which is not a good idea.

Related

How to avoid joining the same table multiple times

In the simplified example:
The idea is to get all (player, coach, and ref) names into the final query, but the only way I can think to do that is to join 3 times on the respective id. What is a better way?
Team
...|Coachid | Playerid | Refid|
--------------------------
...| 98 | 23 | 77 |
Name
Id | Name |
--------------------
98 | Andy |
23 | Charlie |
SELECT [t].[Id],
[t].[TeamName],
[c].[Name] AS CoachName,
[p].[Name] AS PlayerName,
[r].[Name] as RefName
FROM Team [t]
JOIN Name [c]
ON c.id = t.Coachid
JOIN Name [p]
ON p.id = t.PlayerId
JOIN Name [r]
ON r.id = t.RefId
As has been commented, your approach is the best way to address your case and should perform well.
Alternatives would include:
1) A series of correlated subqueries - this is OK because there is just one value to return per relation:
SELECT
t.Id,
t.TeamName,
(SELECT n.Name FROM Name AS n WHERE n.id = t.CoachId) CoachName,
(SELECT n.Name FROM Name AS n WHERE n.id = t.PlayerId) PlayerName,
(SELECT n.Name FROM Name AS n WHERE n.id = t.RefId) RefName
FROM Team t
2) Conditional aggregation - makes the query more cumbersome:
SELECT
t.Id,
t.TeamName,
MAX(CASE WHEN n.id = t.CoachId THEN n.Name END) CoachName,
MAX(CASE WHEN n.id = t.PlayerId THEN n.Name END) PlayerName,
MAX(CASE WHEN n.id = t.RefId THEN n.Name END) RefName
FROM Team t
INNER JOIN Name n ON n.id IN (t.CoachId, t.PlayerId, t.RefId)
GROUP BY t.Id, t.TeamName
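As a rough check that the three formulations agree, here is a sketch that assumes SQLite via Python's sqlite3 module. The team name 'Reds' and the referee row (77, 'Dana') are hypothetical additions, since the sample Name table in the question lists no entry for id 77.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Name (Id INTEGER PRIMARY KEY, Name VARCHAR(50));
    CREATE TABLE Team (Id INTEGER PRIMARY KEY, TeamName VARCHAR(50),
                       CoachId INTEGER, PlayerId INTEGER, RefId INTEGER);
    -- (77, 'Dana') and 'Reds' are made up for this demo.
    INSERT INTO Name VALUES (98, 'Andy'), (23, 'Charlie'), (77, 'Dana');
    INSERT INTO Team VALUES (1, 'Reds', 98, 23, 77);
""")

# Original approach: one join to Name per role.
three_joins = conn.execute("""
    SELECT t.Id, t.TeamName, c.Name, p.Name, r.Name
    FROM Team t
    JOIN Name c ON c.Id = t.CoachId
    JOIN Name p ON p.Id = t.PlayerId
    JOIN Name r ON r.Id = t.RefId
""").fetchall()

# Alternative 1: correlated scalar subqueries.
subqueries = conn.execute("""
    SELECT t.Id, t.TeamName,
           (SELECT n.Name FROM Name AS n WHERE n.Id = t.CoachId),
           (SELECT n.Name FROM Name AS n WHERE n.Id = t.PlayerId),
           (SELECT n.Name FROM Name AS n WHERE n.Id = t.RefId)
    FROM Team t
""").fetchall()

# Alternative 2: a single join plus conditional aggregation.
conditional_agg = conn.execute("""
    SELECT t.Id, t.TeamName,
           MAX(CASE WHEN n.Id = t.CoachId THEN n.Name END),
           MAX(CASE WHEN n.Id = t.PlayerId THEN n.Name END),
           MAX(CASE WHEN n.Id = t.RefId THEN n.Name END)
    FROM Team t
    INNER JOIN Name n ON n.Id IN (t.CoachId, t.PlayerId, t.RefId)
    GROUP BY t.Id, t.TeamName
""").fetchall()

print(three_joins)
```

One difference worth keeping in mind: if a referenced id has no row in Name, the inner joins drop the team row entirely, while the correlated subqueries keep the row and return NULL for that name.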
With the way your table is structured, the way you provided is the best way to do it.
It is odd though, the way you have structured the Team table.
You could reduce the whole query to a single join if the Team table held just the ids of its people and the Name table stored, for each id, that person's role or position.
I guess this is a good example of how important it is to structure your tables correctly.

Query sql to get the first occurrence in a many to many relationship

I have a User table that has a many to many relationship with Areas. This relationship is stored in the Rel_User_area table. I want to show the user name and the first area that appears in the list of areas.
Ex.
User
id | Name
1 | Peter
2 | Joe
Area
id | Name
1 | Area A
2 | Area B
3 | Area C
Rel_User_area
iduser | idarea
1 | 1
1 | 3
2 | 3
The result I want:
User Name | Area
Peter |Area A
Joe |Area C
Using the minimum area id to determine "First" you could use a correlated subquery (A subquery that refers to field(s) in the main query to filter results):
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
WHERE area.id = (SELECT min(idarea) FROM Rel_User_Area WHERE iduser = RUA.iduser)
There are other ways of doing this that may be RDBMS-specific. For example, in Teradata I would use a QUALIFY clause, which doesn't exist in MySQL, SQL Server, Oracle, Postgres, etc. Regardless of the RDBMS, the query above should work. The Teradata version:
SELECT user.name, area.name
FROM
user
INNER JOIN Rel_User_Area RUA ON user.id = RUA.iduser
INNER JOIN Area ON RUA.idarea = area.id
QUALIFY ROW_NUMBER() OVER (PARTITION BY user.id ORDER BY area.id ASC) = 1;
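The portable MIN(idarea) version can be tried out with a short sketch. It assumes SQLite via Python's sqlite3 module, and it renames the User table to users (a hypothetical name) to stay clear of reserved words on other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'users' stands in for the question's User table (renamed for the demo).
    CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(50));
    CREATE TABLE area (id INTEGER PRIMARY KEY, name VARCHAR(50));
    CREATE TABLE rel_user_area (iduser INTEGER, idarea INTEGER);
    INSERT INTO users VALUES (1, 'Peter'), (2, 'Joe');
    INSERT INTO area VALUES (1, 'Area A'), (2, 'Area B'), (3, 'Area C');
    INSERT INTO rel_user_area VALUES (1, 1), (1, 3), (2, 3);
""")

# "First" area is taken to mean the lowest idarea linked to each user.
first_area_sql = """
    SELECT u.name, a.name
    FROM users u
    INNER JOIN rel_user_area rua ON u.id = rua.iduser
    INNER JOIN area a ON rua.idarea = a.id
    WHERE a.id = (SELECT MIN(x.idarea)
                  FROM rel_user_area x
                  WHERE x.iduser = rua.iduser)
    ORDER BY u.id
"""
first_areas = conn.execute(first_area_sql).fetchall()

for user_name, area_name in first_areas:
    print(user_name, "|", area_name)
```

With the sample rows this pairs Peter with Area A and Joe with Area C, matching the result requested in the question.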
using the ID from Rel_user_Area you mentioned in comments...
This should be pretty platform independent.
SELECT U.name as Username, A.Name as Area
FROM (SELECT min(ID) minID, IDUser
FROM Rel_user_Area
GROUP BY IDUser) M
INNER JOIN Rel_user_Area UA
on UA.ID = M.minID
INNER JOIN User U
on U.ID = UA.IDuser
INNER JOIN Area A
on A.ID = UA.IDArea
If CROSS APPLY and TOP are available (as in SQL Server), the following also works; on PostgreSQL or MySQL, LIMIT 1 replaces TOP and a LATERAL join replaces CROSS APPLY.
This runs the CROSS APPLY subquery once for each record in User, so you get the lowest Rel_user_Area ID (the first entry) per user.
SELECT U.name as Username, A.Name as Area
FROM User U
CROSS APPLY (SELECT TOP 1 IDUser, IDArea
FROM Rel_user_Area z
WHERE Z.IDUSER = U.ID
ORDER BY ID ASC) UA
INNER JOIN Area A
on A.ID = UA.IDArea

How to have IN and NOT IN at same time

Can someone help me to figure out how is the best way to do this?
I have a list of people with cars. I need to execute a query that will return people that have a type of car and don't have another type at the same time.
Here is my example:
ID Name CarType
----------- ---------- ----------
1 John MINI VAN
1 John SUV
2 Mary SUV
2 Mary SEDAN
3 Paul SPORT
3 Paul TRUCK
4 Joe SUV
4 Joe MINI VAN
For instance, I want to display only people that have SUV AND DON'T have MINI VAN. If we try the clause CarType IN ('SUV') AND NOT IN ('MINI VAN'), this will not work, because the second statement is just ignored.
In order to return people that have a type but don't have another type at the same time, I tried the following:
Create a temporary table with the IN clause, let's say #Contains
Create a temporary table with the NOT IN clause, let's say #DoesNotContains
Join table with #Contains, this will do the IN clause
On the where clause, look for IDs that are not in the #DoesNotContains table.
The query that I am using is this:
--This is the IN Clause
create table #Contains (
ID int not null
)
--This is the NOT IN Clause
create table #DoesNotContains (
ID int not null
)
--Select IN
insert into #Contains
SELECT ID from #temp where CarType = 'SUV'
--Select NOT IN
insert into #DoesNotContains
SELECT ID from #temp where CarType = 'MINI VAN'
SELECT
a.ID, Name
FROM
#temp a
INNER JOIN #Contains b on b.ID = a.ID
WHERE
a.ID NOT IN (SELECT ID FROM #DoesNotContains)
Group by
a.ID, Name
This will return Mary because she has a SUV but does not have a MINI VAN.
Here are my questions:
Is it possible to execute this IN and NOT IN in the query, without temp tables? Is there something new in SQL that does that? (Sorry, last time I worked with SQL was SQL 2005)
Should we use temp tables for this?
If this is the way to go, should I use IN and NOT IN instead of the JOIN?
How to replace the NOT IN clause with a JOIN?
Thank y'all!
EDIT
I just tested the solutions but unfortunately I did not specify that I need a combination of cartypes. My bad :(
For instance, suppose I want all users that have SUV and MINI VAN but not TRUCK and not SEDAN. With the sample data above, John and Joe are returned.
This is normally accomplished with a single query in standard SQL, using NOT EXISTS:
SELECT *
FROM mytable AS t1
WHERE CarType = 'SUV' AND
NOT EXISTS (SELECT *
FROM mytable AS t2
WHERE t1.Name = t2.Name AND t2.CarType = 'MINI VAN')
The above query selects all people who have CarType = 'SUV' but do not have CarType = 'MINI VAN'.
Here's one way
SELECT Id, Name
FROM Cars
WHERE CarType = 'SUV'
EXCEPT
SELECT Id, Name
FROM Cars
WHERE CarType = 'MINI VAN'
Or another
SELECT Id, Name
FROM Cars
WHERE CarType IN ('SUV', 'MINI VAN')
GROUP BY Id, Name
HAVING MIN(CarType) = 'SUV'
Or a more generic version that addresses the different requirement in the comment.
SELECT Id,
NAME
FROM Cars
WHERE CarType IN ( 'SUV', 'MINI VAN', 'TRUCK')
GROUP BY Id,
NAME
HAVING COUNT(DISTINCT CASE
WHEN CarType IN ( 'SUV', 'MINI VAN' ) THEN CarType
END) = 2
AND COUNT(DISTINCT CASE
WHEN CarType IN ( 'TRUCK' ) THEN CarType
END) = 0
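To see how the conditional counts behave on the sample data, here is a sketch that assumes SQLite via Python's sqlite3 module and a table named Cars. It runs the simple "SUV but no MINI VAN" check with NOT EXISTS, then the combined requirement from the question's edit, with SEDAN added to the exclusion list as that edit asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (Id INTEGER, Name VARCHAR(50), CarType VARCHAR(20));
    INSERT INTO Cars VALUES
        (1, 'John', 'MINI VAN'), (1, 'John', 'SUV'),
        (2, 'Mary', 'SUV'),      (2, 'Mary', 'SEDAN'),
        (3, 'Paul', 'SPORT'),    (3, 'Paul', 'TRUCK'),
        (4, 'Joe',  'SUV'),      (4, 'Joe',  'MINI VAN');
""")

# People with an SUV and no MINI VAN.
suv_without_minivan = conn.execute("""
    SELECT DISTINCT c.Id, c.Name
    FROM Cars c
    WHERE c.CarType = 'SUV'
      AND NOT EXISTS (SELECT 1 FROM Cars x
                      WHERE x.Id = c.Id AND x.CarType = 'MINI VAN')
""").fetchall()

# People with both SUV and MINI VAN, and neither TRUCK nor SEDAN.
# The first count must reach the number of required types (2),
# the second must stay at 0 for the excluded types.
both_and_neither = conn.execute("""
    SELECT Id, Name
    FROM Cars
    WHERE CarType IN ('SUV', 'MINI VAN', 'TRUCK', 'SEDAN')
    GROUP BY Id, Name
    HAVING COUNT(DISTINCT CASE WHEN CarType IN ('SUV', 'MINI VAN')
                               THEN CarType END) = 2
       AND COUNT(DISTINCT CASE WHEN CarType IN ('TRUCK', 'SEDAN')
                               THEN CarType END) = 0
    ORDER BY Id
""").fetchall()

print(suv_without_minivan)
print(both_and_neither)
```

The required count has to match the number of required car types, so adding a third required type means changing the 2 to a 3.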
Using LEFT JOIN:
SELECT a.ID,
Name
FROM #temp a
INNER JOIN #Contains b ON b.ID = a.ID
LEFT OUTER JOIN #DoesNotContains c ON c.ID = a.ID
WHERE c.ID IS NULL
The INNER JOIN will return records where b.ID and a.ID match.
The LEFT OUTER JOIN returns all records, with NULL where there is no match - adding WHERE c.ID IS NULL returns records from a that don't match to c.
The keyword except is your friend. This is the general idea
where carType in
(select carType
from cars
where you want to include them
except
select carType
from cars
where you want to exclude them)
You can work out the details.

How to fetch the non matching rows in Oracle

Can anyone help me fetch the non matching rows from two tables in Oracle?
Table: Names
Class_id Stud_name
S001 JAMES
S001 PETER
S002 MARK
Table: Course
Course_id Stud_name
S001 JAMES
S001 KEITH
S002 MARK
Output
I need the rows to display as
CLASS ID STUD_NAME_FROM_NAME_TABLE STUD_NAME_FROM_COURSE_TABLE
---------------------------------------------------------------------
S001 PETER KEITH
I have used Oracle joins to fetch the non matching names:
SELECT *
FROM Names, Course
WHERE Names.Class_id=Course.Course_id
AND Names.Stud_name<>Course.Stud_name
This query is returning duplicate rows.
If you insist on Join you can use this one:
SELECT *
FROM Names
FULL OUTER JOIN Course ON Names.Class_id=Course.Course_id
AND Names.Stud_name = Course.Stud_name
WHERE Names.Stud_name IS NULL or Course.Stud_name IS NULL
Fetches unmatched rows in Names table
SELECT * FROM Names
WHERE
NOT EXISTS
(SELECT 'x' from Course
WHERE
Names.Class_id = Course.Course_id AND
Names.Stud_name = Course.Stud_name)
Fetches unmatched rows in Names and Course too!
SELECT Names.Class_id,Names.Stud_name,C1.Stud_name
FROM Names , Course C1
WHERE Names.Class_id = C1.Course_id AND
NOT EXISTS
(SELECT 'x' from Course C2
WHERE
Names.Class_id = C2.Course_id AND
Names.Stud_name = C2.Stud_name);
When you ask for unmatching rows I assume that you want rows that exist in names but not in course.
If this is the case you're probably after
select * from names
where (class_id, stud_name ) not in
(select course_id, stud_name from course);
Your query returned duplicate rows because for each row in names it selected all rows in course that satisfied the where condition.
So, for the row S001, PETER in names it found that both S001, JAMES and S001, KEITH matched that condition; thus, that row was "returned" twice.
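The duplication described above can be reproduced with a small sketch, assuming SQLite via Python's sqlite3 module as a stand-in for Oracle. It runs the original inequality join and then a NOT EXISTS anti-join on the same sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Names (Class_id VARCHAR(10), Stud_name VARCHAR(50));
    CREATE TABLE Course (Course_id VARCHAR(10), Stud_name VARCHAR(50));
    INSERT INTO Names VALUES ('S001', 'JAMES'), ('S001', 'PETER'), ('S002', 'MARK');
    INSERT INTO Course VALUES ('S001', 'JAMES'), ('S001', 'KEITH'), ('S002', 'MARK');
""")

# The question's join: every Names row pairs with every *different*
# student in the same course, so PETER shows up once per such student.
inequality_join = conn.execute("""
    SELECT n.Class_id, n.Stud_name, c.Stud_name
    FROM Names n, Course c
    WHERE n.Class_id = c.Course_id
      AND n.Stud_name <> c.Stud_name
    ORDER BY n.Stud_name, c.Stud_name
""").fetchall()

# Anti-join: Names rows with no identical (id, name) row in Course.
unmatched_names = conn.execute("""
    SELECT n.Class_id, n.Stud_name
    FROM Names n
    WHERE NOT EXISTS (SELECT 1 FROM Course c
                      WHERE c.Course_id = n.Class_id
                        AND c.Stud_name = n.Stud_name)
""").fetchall()

print(inequality_join)
print(unmatched_names)
```

The join also pairs JAMES with KEITH even though JAMES is present in both tables, which is another sign that an inequality join is not an anti-join.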
EDIT Since it is not clear if stud_name is a primary key, or unique (and on second sight I think it's not), you'd probably want a
select * from names
where not exists (
select 1 from course where
names.class_id = course.course_id and
names.stud_name = course.stud_name
)
Edit II if you insist on using a join (as per your comment) you might want to try a
select distinct names.* from...
Hope it helps you
with not_in_class as
(select a.*
from Names a
where not exists ( select 'x'
from course b
where b.Course_id = a.class_id
and a.Stud_name = b.Stud_name)),
not_in_course as
(select b.*
from course b
where not exists ( select 'x'
from Names a
where b.Course_id = a.class_id
and a.Stud_name = b.Stud_name))
select x.class_id,
x.Stud_name NOT_IN_CLASS,
y.stud_name NOT_IN_COURSE
from not_in_class x, not_in_course y
where x.class_id = y.course_id
Output
| CLASS_ID | NOT_IN_CLASS | NOT_IN_COURSE |
|----------|--------------|---------------|
| S001 | PETER | KEITH |
The only problem is that this works cleanly only when there is a single mismatch per id. If both tables have multiple mismatches for the same id, the final join pairs every unmatched name with every unmatched course row for that id, so you would need to rework the query.
Well, I am not sure if I understand correctly what you are asking. I think you want a list of all IDs where the student list in class table and course table differs. Then you want to show the id and the students that are in class but not in course and the students that are in course but not in class.
To do so you would full outer join the tables. That gives you students that are both in class and course, students that are in class and not in course, and students that are in course and not in class. Then filter the results to rows where either class_id or course_id is null to get the students missing in course or class. Finally, group by id and list the students.
select coalesce(class.class_id, course.course_id) as id
, listagg(class.stud_name, ',') within group (order by class.stud_name) as missing_in_course
, listagg(course.stud_name, ',') within group (order by course.stud_name) as missing_in_class
from class
full outer join course
on (class.class_id = course.course_id and class.stud_name = course.stud_name)
where class.class_id is null or course.course_id is null
group by coalesce(class.class_id, course.course_id);
Here is the SQL fiddle showing how it works: http://sqlfiddle.com/#!4/8aaaa/2
EDIT: In Oracle 9i there is no listagg. You can use the unofficial function wm_concat instead:
select coalesce(class.class_id, course.course_id) as id
, wm_concat(class.stud_name) as missing_in_course
, wm_concat(course.stud_name) as missing_in_class
from class
full outer join course
on (class.class_id = course.course_id and class.stud_name = course.stud_name)
where class.class_id is null or course.course_id is null
group by coalesce(class.class_id, course.course_id);

SQL - Select all skills

It's been a while since I used SQL, so apologies if this is too easy.
I have to select all the skills that a user has, so I have three tables.
User (id, name)
Skills (id, name)
User_skills (id_user, id_skill)
If user1 has 2 skills, for example Hibernate (id 1) and Java (id 2),
and user2 has 1 skill, Java (id 2),
Passing 1 and 2, I want to retrieve users that have both.
With the IN() function I get all the users that have at least one of the skills, but I want to filter them out!
Thanks to all in advance
If one skill can only be assigned exactly once to a user (i.e. (id_user, id_skill) is the PK for the user_skills table), then the following will do what you want:
SELECT id_user
FROM user_skills
WHERE id_skill IN (1,2)
GROUP BY id_user
HAVING count(*) = 2
Join to the association table user_skills twice, putting the skill ID in the on clause of each join:
select u.*
from user u
join user_skills us1 on us1.id_user = u.id and us1.id_skill = 1
join user_skills us2 on us2.id_user = u.id and us2.id_skill = 2
By using join (and not left join), this query requires the user to have both skills. A third option is one EXISTS test per skill:
SELECT name FROM user as u
WHERE
EXISTS( SELECT 1 FROM User_skills WHERE id_user=u.id AND id_skill=1 )
AND EXISTS( SELECT 1 FROM User_skills WHERE id_user=u.id AND id_skill=2 )
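The three approaches above can be compared side by side with a sketch that assumes SQLite via Python's sqlite3 module. The table is called users here (a hypothetical rename of User), and (id_user, id_skill) is declared as the primary key, which is the assumption the COUNT(*) version relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(50));
    CREATE TABLE user_skills (id_user INTEGER, id_skill INTEGER,
                              PRIMARY KEY (id_user, id_skill));
    INSERT INTO users VALUES (1, 'user1'), (2, 'user2');
    -- user1 has Hibernate (1) and Java (2); user2 has only Java (2).
    INSERT INTO user_skills VALUES (1, 1), (1, 2), (2, 2);
""")

# 1) Count matching rows per user; 2 matches means both skills.
by_count = conn.execute("""
    SELECT id_user
    FROM user_skills
    WHERE id_skill IN (1, 2)
    GROUP BY id_user
    HAVING COUNT(*) = 2
""").fetchall()

# 2) One inner join to the association table per required skill.
by_double_join = conn.execute("""
    SELECT u.id
    FROM users u
    JOIN user_skills us1 ON us1.id_user = u.id AND us1.id_skill = 1
    JOIN user_skills us2 ON us2.id_user = u.id AND us2.id_skill = 2
""").fetchall()

# 3) One EXISTS test per required skill.
by_exists = conn.execute("""
    SELECT u.id
    FROM users u
    WHERE EXISTS (SELECT 1 FROM user_skills
                  WHERE id_user = u.id AND id_skill = 1)
      AND EXISTS (SELECT 1 FROM user_skills
                  WHERE id_user = u.id AND id_skill = 2)
""").fetchall()

print(by_count, by_double_join, by_exists)
```

The counting version scales most easily to a longer skill list, since only the IN list and the expected count change, while the join and EXISTS versions need one extra clause per skill.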