Select and group results using the same column as a parameter - sql

I have a query that returns the following result (example):
+----+-----------+------------+
| ID | FirstName | CourseName |
+----+-----------+------------+
| 1 | Alice | X |
| 2 | Bob | X |
| 2 | Bob | Y |
+----+-----------+------------+
The query joins three tables (users, user-courses and course) and returns each user's id and first name, together with the name of every course the user is in.
I need to create a query that returns the users who are in specific courses. For example:
Select all users in course X: returns the details of both Alice and Bob.
Select all users in courses X AND Y: returns only Bob, since Alice isn't in course Y.
The result of the query for X AND Y will be:
+----+-----------+
| ID | FirstName |
+----+-----------+
| 2 | Bob |
+----+-----------+

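Assuming that the user and course tables each have id and name columns, and that user-courses holds only the foreign key ids (user_id and course_id), you can do the following. Note that a hyphenated table name such as user-courses has to be quoted in most databases, for example [user-courses] in MS Access or "user-courses" in PostgreSQL.
For the first question: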
select u.* from user u
inner join user-courses uc on uc.user_id=u.id
inner join course c on c.id=uc.course_id and c.name='X';
The inner joins keep only the users that have a matching enrollment, and the last condition (c.name = 'X') restricts the course. You can apply that filter in other ways, for example in a WHERE clause.
For the second one:
select * from user
where id in (
select distinct a.* from (
select user_id from user-courses uc inner join course c
on c.id=uc.course_id
where c.name='X'
) a
inner join (
select user_id from user-courses uc inner join course c
on c.id=uc.course_id
where c.name='Y'
) b
on a.user_id=b.user_id
);
MS Access doesn't have INTERSECT, so I used an inner join (between a and b) to achieve the same result. Subquery a holds the users from course 'X' and b those from course 'Y'. The inner join intersects the two, leaving the users that are in both courses. The outer query then uses those ids to filter the user table.
I don't have MS Access, so I tested this in PostgreSQL, but I stuck to ANSI SQL, so I hope it works there as well.
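To check both queries against the sample data from the question, here is a minimal sketch that runs them in an in-memory SQLite database through Python's sqlite3 module. The table and column names (users, courses, user_courses, first_name) are assumptions chosen to avoid quoting; the original schema was not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; the real table and column names may differ.
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_courses (user_id INTEGER, course_id INTEGER);
    INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO courses VALUES (1, 'X'), (2, 'Y');
    INSERT INTO user_courses VALUES (1, 1), (2, 1), (2, 2);
""")

# Users in course X: the course filter sits in the join condition.
in_x = conn.execute("""
    SELECT u.id, u.first_name
    FROM users u
    INNER JOIN user_courses uc ON uc.user_id = u.id
    INNER JOIN courses c ON c.id = uc.course_id AND c.name = 'X'
    ORDER BY u.id
""").fetchall()

# Users in both X and Y: inner join the two per-course id lists,
# which plays the role of INTERSECT.
in_x_and_y = conn.execute("""
    SELECT id, first_name
    FROM users
    WHERE id IN (
        SELECT DISTINCT a.user_id
        FROM (SELECT uc.user_id
              FROM user_courses uc
              INNER JOIN courses c ON c.id = uc.course_id
              WHERE c.name = 'X') a
        INNER JOIN (SELECT uc.user_id
                    FROM user_courses uc
                    INNER JOIN courses c ON c.id = uc.course_id
                    WHERE c.name = 'Y') b
            ON a.user_id = b.user_id
    )
""").fetchall()

print(in_x)        # [(1, 'Alice'), (2, 'Bob')]
print(in_x_and_y)  # [(2, 'Bob')]
```

SQLite is used here only because it ships with Python; the SQL itself stays close to the ANSI form used in the answer.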

Related

Left join command is not showing all results

I have a table RESTAURANT:
Id | Name
------------------
0 | 'McDonalds'
1 | 'Burger King'
2 | 'Starbucks'
3 | 'Pans'
And a table ORDER:
Id | ResId | Client
--------------------
0 | 1 | 'Peter'
1 | 2 | 'John'
2 | 2 | 'Peter'
Where 'ResId' is a foreign key from RESTAURANT.Id.
I want to select the number of orders per restaurant:
Expected result:
Restaurant | Number of orders
----------------------------------
'McDonalds' | 0
'Burger King' | 1
'Starbucks' | 2
'Pans' | 0
Actual result:
Restaurant | Number of orders
----------------------------------
'McDonalds' | 0
'Burger King' | 1
'Starbucks' | 2
Command used:
select r.Name, count(o.ResId)
from RESTAURANT r
left join ORDER o on r.Id like o.ResId
group by o.ResId;
Just fix the group by clause:
select r.name, count(o.resid) as cnt_orders
from restaurants r
left join orders o on r.id = o.resid
group by r.id, r.name;
That way, the SELECT and GROUP BY clauses are consistent. I also added the restaurant id to the group, so that restaurants that happen to share a name are not aggregated together, and changed like to =: this is more efficient and does not alter the logic. Note that the count is taken on a column of orders (count(o.resid)) rather than count(*); with a left join, count(*) would report 1 for a restaurant that has no orders.
You could also phrase this with a subquery, so there is no need for outer aggregation. I would prefer:
select r.*,
(select count(*) from orders o where o.resid = r.id) as cnt_orders
from restaurants r
Your query should be generating an error because the select columns and the group by columns are incompatible. Just aggregate by the unaggregated columns in the select:
select r.Name, count(o.ResId)
from RESTAURANT r left join
ORDER o
on r.Id = o.ResId
group by r.Name;
Notes:
You might want to include r.id in the GROUP BY (and SELECT) in case restaurants can have the same name.
Note the use of = instead of LIKE. The ids look like numbers, so you should use number operations. LIKE is a string operation.
ORDER is a bad name for a table because it is a SQL keyword.
As a general rule, in a LEFT JOIN, you don't want the aggregation keys to be from the second table, because those values could be NULL.
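The points above about grouping on the left-hand table can be checked with a small sketch. It loads the sample data into an in-memory SQLite database via Python's sqlite3 module and compares COUNT(o.res_id) with COUNT(*). The names restaurants, orders and res_id are assumptions (the table is called orders because ORDER is a keyword), and counts is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed names; "orders" avoids the reserved word ORDER.
    CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, res_id INTEGER, client TEXT);
    INSERT INTO restaurants VALUES
        (0, 'McDonalds'), (1, 'Burger King'), (2, 'Starbucks'), (3, 'Pans');
    INSERT INTO orders VALUES
        (0, 1, 'Peter'), (1, 2, 'John'), (2, 2, 'Peter');
""")

def counts(count_expr):
    """Run the left join grouped by restaurant with the given count expression."""
    sql = f"""
        SELECT r.name, {count_expr}
        FROM restaurants r
        LEFT JOIN orders o ON r.id = o.res_id
        GROUP BY r.id, r.name
        ORDER BY r.id
    """
    return conn.execute(sql).fetchall()

# Counting a column of the right-hand table ignores the NULLs that the
# left join produces for restaurants without orders.
by_column = counts("COUNT(o.res_id)")

# COUNT(*) counts the NULL-extended row too, so empty restaurants show 1.
by_star = counts("COUNT(*)")

print(by_column)  # [('McDonalds', 0), ('Burger King', 1), ('Starbucks', 2), ('Pans', 0)]
print(by_star)    # [('McDonalds', 1), ('Burger King', 1), ('Starbucks', 2), ('Pans', 1)]
```

Grouping by r.id and r.name keeps all four restaurants in the output, which is what the query in the question lost by grouping on o.ResId.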

Excluding results from joined tables based on a single count value

I have two tables that I'm joining, and would like to exclude any result that has a count greater than 1 for a value in the second table.
For this example, I have a table called movie_info, which has information about films, each with a unique ID. A second table called crew_info has information about each film's crew (with film's unique ID), but rather than 1 entry per film, there are multiple entries per crew member. Visually it would be like this:
+----------------------+
| movie_info |
+======================+
| id = '123' |
+----------------------+
+----------------------+
| crew_info |
+======================+
| id = '123' |
+----------------------+
| name = 'John' |
+----------------------+
| role = 'director' |
+----------------------+
+----------------------+
| crew_info |
+======================+
| id = '123' |
+----------------------+
| name = 'Mary' |
+----------------------+
| role = 'director' |
+----------------------+
+----------------------+
| crew_info |
+======================+
| id = '123' |
+----------------------+
| name = 'Sue' |
+----------------------+
| role = 'writer' |
+----------------------+
I join the two tables like so:
SELECT a.id, b.*
FROM movie_info as a
LEFT JOIN crew_info as b
on a.id = b.id
So far it's all standard. What I'm trying to do though is only return results in which crew_info has a count of only 1 director. So a standalone query like this:
SELECT id
FROM crew_info
WHERE role = 'director'
HAVING count(id) = 1
successfully excludes results like this example, where there is more than 1 director. But how exactly do I join this with the movie_info table, so that it's all in one query?
I'm sorry if this is unclear. I am relatively new to SQL so please let me know if there's anything I haven't expressed properly. Thank you.
EDIT: One more thing I forgot about, sorry! If the count is more than 1, how do I still include results if another value is matched? So let's say movie_info also had a field called sequel_id, which is only filled out if the film is a sequel. I want to exclude results that have a director count > 1, AND an empty or null sequel_id, but include results that have a director count > 1 AND a valid sequel_id value. I tried something like (HAVING COUNT(*) = 1 OR (HAVING COUNT(*) > 1 AND sequel_id IS NOT NULL)) but I'm getting a syntax error.
Just use your existing query as an EXISTS sub-query:
SELECT *
FROM movie_info
WHERE EXISTS (
SELECT 1
FROM crew_info
WHERE crew_info.id = movie_info.id
AND crew_info.role = 'director'
HAVING COUNT(*) = 1
)
OR sequel_id IS NOT NULL
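Try something like the following, using a CTE:
with cte as
(
SELECT id
FROM crew_info
WHERE role = 'director'
GROUP BY id
HAVING count(id) > 1
) select a.*,b.id FROM movie_info as a
LEFT JOIN cte as b
on a.id = b.id
Here b.id is non-null only for films with more than one director, so add WHERE b.id IS NULL to exclude those films.
Without a CTE, you can use a subquery instead: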
select a.*,b.id FROM movie_info as a
left join (
SELECT id
FROM crew_info
WHERE role = 'director'
group by id
HAVING count(id) > 1
) b on a.id=b.id
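As a rough check of the filtering rule from the question (keep a film if it has exactly one director, or if it has a sequel_id), here is a sketch using Python's sqlite3 module. It swaps the EXISTS ... HAVING form for an equivalent scalar subquery comparison, because older SQLite versions reject HAVING without GROUP BY. The films '456' and '789' and the extra crew names are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie_info (id TEXT PRIMARY KEY, sequel_id TEXT);
    CREATE TABLE crew_info (id TEXT, name TEXT, role TEXT);
    -- '123' comes from the question; '456' and '789' are hypothetical.
    INSERT INTO movie_info VALUES
        ('123', NULL),   -- two directors, not a sequel: excluded
        ('456', NULL),   -- one director: kept
        ('789', '456');  -- two directors, but a sequel: kept
    INSERT INTO crew_info VALUES
        ('123', 'John', 'director'),
        ('123', 'Mary', 'director'),
        ('123', 'Sue', 'writer'),
        ('456', 'Ann', 'director'),
        ('789', 'Tom', 'director'),
        ('789', 'Eve', 'director');
""")

# Correlated scalar subquery: count the directors of each film and keep
# the film when that count is 1 or when sequel_id is filled in.
rows = conn.execute("""
    SELECT m.id
    FROM movie_info m
    WHERE (SELECT COUNT(*)
           FROM crew_info c
           WHERE c.id = m.id AND c.role = 'director') = 1
       OR m.sequel_id IS NOT NULL
    ORDER BY m.id
""").fetchall()

print(rows)  # [('456',), ('789',)]
```

A sequel_id stored as an empty string rather than NULL would still pass the IS NOT NULL test, so that case needs an extra condition if it can occur in the data.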

How to count occurrence of IDs and show this amount with name of item with this ID from other table in SQL?

If I have the tables
Person: ID_Person, Name
Profession: ID_Prof, Prof_Name, ID_Person
and ID_Person appears multiple times in the second table, how can I show all Person names together with the number of their professions?
I know that if I want to count something I can write
SELECT ID_Person, count(*) as c
FROM Profession
GROUP BY ID_Person;
but I don't know how to link it with the Name column from the other table to get the proper values.
Here is one way (MySQL InnoDB)
Person
+-----------+-------+
| ID_Person | Name |
+-----------+-------+
| 1 | bob |
| 2 | alice |
+-----------+-------+
Profession
+---------+--------------------+-----------+
| ID_Prof | Prof_Name | ID_Person |
+---------+--------------------+-----------+
| 1 | janitor | 1 |
| 2 | cook | 1 |
| 3 | computer scientist | 2 |
| 4 | home maker | 2 |
| 7 | astronaut | 2 |
+---------+--------------------+-----------+
select Name, count(Prof_Name)
from Person left join Profession
on (Person.ID_Person=Profession.ID_Person)
group by Name;
+-------+------------------+
| Name | count(Prof_Name) |
+-------+------------------+
| alice | 3 |
| bob | 2 |
+-------+------------------+
Hope this helps.
To show only the people with multiple professions, join the two tables, aggregate with count() using group by, and filter using having:
select pe.ID_Person, pe.Name, count(*) as ProfessionCount
from Person pe
inner join Profession pr
on pe.ID_Person = pr.ID_Person
group by pe.ID_Person, pe.Name
having count(*)>1
If you want to show the professions for those people as well:
select
multi.ID_Person
, multi.Name
, multi.ProfessionCount
, prof.ID_Prof
, prof.Prof_Name
from (
select pe.ID_Person, pe.Name, count(*) as ProfessionCount
from Person pe
inner join Profession pr
on pe.ID_Person = pr.ID_Person
group by pe.ID_Person, pe.Name
having count(*)>1
) multi
inner join Profession prof
on multi.ID_Person = prof.ID_Person
You can probably try something like the query below. However, you will have to think about whether you need a left join or an inner join. You would want a left join if there could be someone who has no professions and therefore does not appear in the Profession table.
SELECT pe.Name
, Professions = COUNT(pr.Prof_Name)
FROM dbo.Person (NOLOCK) pe
JOIN dbo.Profession (NOLOCK) pr ON pe.ID_Person = pr.ID_Person
GROUP BY pe.Name
I believe you're looking for something like this. The left join brings in every person and won't exclude anyone.
The join can also be an inner join, which would then only show people that exist in both tables.
LEFT
select x.ID_Person, count(y.ID_Person) as [count] from table1 x
left join table2 y on y.ID_Person= x.ID_Person
where x.ID_Person is not null
group by x.ID_Person
INNER
select x.ID_Person, count(y.ID_Person) from table1 x
inner join table2 y on y.ID_Person= x.ID_Person
group by x.ID_Person
The easiest solution is probably counting in a subquery:
select
id_person,
name,
(select count(*) from profession pr where pr.id_person = p.id_person) as profession_count
from person p;
You can achieve the same with an outer join:
select
p.id_person,
p.name,
coalesce(pr.cnt, 0) as profession_count
from person p
left join (select id_person, count(*) as cnt from profession group by id_person) pr
on pr.id_person = p.id_person;
It's usually a good idea to aggregate before joining. Anyway, this is how to join first and aggregate then:
select
p.id_person,
p.name,
coalesce(count(pr.id_person), 0) as profession_count
from person p
left join profession pr on pr.id_person = p.id_person
group by p.id_person, p.name;
As per standard SQL it would suffice to group by p.id_person, as the name functionally depends on the id (i.e. the id uniquely identifies a person, so exactly one name belongs to it). Some DBMS, however, don't fully comply with the standard here and require you to either put the name in the group by clause as shown or dummy-aggregate it in the select clause (e.g. max(p.name)) instead.
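To compare the two approaches above (aggregate before joining versus join first and aggregate afterwards), here is a small sketch run through Python's sqlite3 module. It uses the sample rows from the earlier answer plus one hypothetical person, carol, who has no professions, to show that both forms report a count of 0 for her.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id_person INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profession (
        id_prof INTEGER PRIMARY KEY, prof_name TEXT, id_person INTEGER);
    -- carol is a made-up row with no professions.
    INSERT INTO person VALUES (1, 'bob'), (2, 'alice'), (3, 'carol');
    INSERT INTO profession VALUES
        (1, 'janitor', 1), (2, 'cook', 1),
        (3, 'computer scientist', 2), (4, 'home maker', 2),
        (7, 'astronaut', 2);
""")

# Aggregate first, then outer join; COALESCE turns the missing count into 0.
aggregate_first = conn.execute("""
    SELECT p.id_person, p.name, COALESCE(pr.cnt, 0) AS profession_count
    FROM person p
    LEFT JOIN (SELECT id_person, COUNT(*) AS cnt
               FROM profession
               GROUP BY id_person) pr
        ON pr.id_person = p.id_person
    ORDER BY p.id_person
""").fetchall()

# Join first, then aggregate; counting pr.id_person skips the NULLs
# produced by the left join.
join_first = conn.execute("""
    SELECT p.id_person, p.name, COUNT(pr.id_person) AS profession_count
    FROM person p
    LEFT JOIN profession pr ON pr.id_person = p.id_person
    GROUP BY p.id_person, p.name
    ORDER BY p.id_person
""").fetchall()

print(aggregate_first)  # [(1, 'bob', 2), (2, 'alice', 3), (3, 'carol', 0)]
print(join_first)       # [(1, 'bob', 2), (2, 'alice', 3), (3, 'carol', 0)]
```

Since COUNT over an empty group already yields 0, the COALESCE is only needed in the aggregate-first form, where the outer join produces NULL for people without rows in the subquery.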

Optimizing WHERE clause SQL query

I'm using SQL Server. I find myself doing complex queries in the WHERE clause with the following syntax:
SELECT ..
WHERE StudentID IS NULL OR StudentID NOT IN (SELECT StudentID from Students)
I was wondering if there's a better or cleaner way to write this, because it is a small example of a bigger query that includes multiple conditions like that.
As you can see, I'm trying to find the rows in which a specific column is either null or not a valid id.
EDIT
Courses:
|CourseID | StudentID | StudentID2|
|-----------------------------------|
| 1 | 100 | NULL |
| 2 | NULL | 200 |
| 3 | 1 | 1 |
Students
|StudentID | Name |
|--------------------
| 1 | A |
| 2 | B |
| 3 | C |
Query:
SELECT CourseID
FROM Courses
WHERE
StudentID IS NULL OR StudentID NOT IN (SELECT StudentID FROM Students)
OR StudentID2 IS NULL OR StudentID2 NOT IN (SELECT StudentID FROM Students)
Result:
| CourseID |
|-----------|
| 1 |
| 2 |
As you can see, courses 1 and 2 have invalid students.
Alain was close, except the studentID2 column is associated with the courses table. Additionally, this is joining each studentID column to an instance of the students table and the final WHERE is testing if EITHER of the student ID's fail, so even if Student1 is valid and still fails on Student2, it will capture the course as you are intending.
SELECT
C.CourseID
FROM
Courses C
LEFT JOIN Students S
ON C.StudentId = S.StudentId
LEFT JOIN Students S2
ON C.StudentId2 = S2.StudentId
WHERE
S.StudentId IS NULL
OR S2.StudentID IS NULL
This is not a sure thing, but in my experience it performs better than the query in the question:
SELECT CourseID FROM Courses WHERE
NOT EXISTS (SELECT 1 FROM Students WHERE Students.StudentID = Courses.StudentID);
Also create an index on StudentID in the Students table.
And if your data model supports it, create a primary key/foreign key relationship between the two tables. That way you definitely avoid invalid values in the Courses table.
After your update:
SELECT CourseID FROM Courses WHERE
NOT EXISTS (SELECT 1 FROM Students WHERE Students.StudentID = Courses.StudentID)
OR NOT EXISTS (SELECT 1 FROM Students WHERE Students.StudentID = Courses.StudentID2);
The NOT EXISTS pattern is fine, but there are several other ways to do this.
For example, with LEFT JOIN (two left joins, since two columns are checked):
SELECT *
from Courses
LEFT JOIN Students Student1
on Courses.StudentId = Student1.StudentId
LEFT JOIN Students Student2
on Courses.StudentId2 = Student2.StudentId
WHERE
-- No matching student for at least one of the two columns
student1.StudentId IS NULL
or student2.StudentId IS NULL
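Whether the final WHERE combines the two IS NULL tests with OR or with AND matters as soon as a course has one valid and one invalid student. The sketch below, run through Python's sqlite3 module, uses the sample data from the question plus a hypothetical course 4 (StudentID 2 valid, StudentID2 999 invalid) to show the difference. The snake_case names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (
        course_id INTEGER PRIMARY KEY, student_id INTEGER, student_id2 INTEGER);
    INSERT INTO students VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Course 4 is a made-up row: one valid and one invalid student.
    INSERT INTO courses VALUES
        (1, 100, NULL), (2, NULL, 200), (3, 1, 1), (4, 2, 999);
""")

def invalid_courses(combine):
    """Anti-join on both student columns; combine is 'OR' or 'AND'."""
    sql = f"""
        SELECT c.course_id
        FROM courses c
        LEFT JOIN students s1 ON c.student_id = s1.student_id
        LEFT JOIN students s2 ON c.student_id2 = s2.student_id
        WHERE s1.student_id IS NULL {combine} s2.student_id IS NULL
        ORDER BY c.course_id
    """
    return [row[0] for row in conn.execute(sql)]

# OR: a course is reported when either column is NULL or unmatched,
# which is what the query in the question asks for.
either_invalid = invalid_courses("OR")

# AND: a course is reported only when both columns fail.
both_invalid = invalid_courses("AND")

print(either_invalid)  # [1, 2, 4]
print(both_invalid)    # [1, 2]
```

A NULL student id never matches in the join condition, so the same IS NULL test covers both the NULL case and the invalid-id case without a separate check.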

sql query related

Hi, I have a "Student" table with the records below:
INSERT into Student(studId,name)
Values(1,'A'),
(2,'B'),
(3,'C'),
(4,'D')
I have a "Department" table (dept) with the records below:
INSERT into dept(deptId,Deptname,EmpId)
Values('D1','Phy',1),
('D2','Maths',2),
('D3','Geo',3)
How can I find the student who does not belong to any department? In this case the result should be "D".
I know the left outer join would return all the records from the student table but i am only interested to get 1 record i.e.: of student "D".
SELECT *
FROM Student
WHERE studId NOT IN (SELECT EmpId FROM dept)
This finds all Student.studId entries that do not exist in the dept.EmpId column (assuming I read your tables correctly).
Ideally, though, you should probably separate "Student" and "Department" and create a joining table (maybe called "Student_Department") that links the keys of each table.
+--------------+ +--------------------+ +--------------+
| Student | | Student_Department | | Dept |
|--------------| |--------------------| |--------------|
| studId | <-----| studId | .-> | deptId |
| name | | deptId | --' | name |
| ... | +--------------------+ | ... |
+--------------+ +--------------+
This allows you to only have to define a student and department once, but can assign one student to multiple departments, one department to multiple students, or any combination therein.
Another option is to exclude the students that do join to a department:
select *
from Student
where studId not in (select s.studId from Student s
inner join dept d on d.EmpId = s.studId)
While I agree with some commenters that this sounds like a homework problem, I'll answer with a question...
I know the left outer join would return all the records from the student table but i am only interested to get 1 record i.e.: of student "D".
OK, let's say you run the following query:
SELECT * FROM Student
LEFT OUTER JOIN dept ON Student.studId = dept.EmpId
You'd get the results:
studId name deptId deptName EmpId
1 A D1 Phy 1
2 B D2 Maths 2
3 C D3 Geo 3
4 D NULL NULL NULL
Can you add a WHERE clause to this query that will filter out only the data you want? :)
SELECT s.name
FROM Student s
LEFT JOIN Dept d ON d.empId = s.studId
WHERE d.empId IS NULL
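The two main approaches in this thread (LEFT JOIN ... IS NULL and NOT IN) can be compared with a short sketch using Python's sqlite3 module. It also shows a well-known pitfall: if the subquery behind NOT IN returns a NULL, NOT IN matches no rows at all, while the anti-join is unaffected. The department row 'D4' with a NULL emp_id is made up for that purpose, and the snake_case names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dept (dept_id TEXT PRIMARY KEY, dept_name TEXT, emp_id INTEGER);
    INSERT INTO student VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO dept VALUES ('D1', 'Phy', 1), ('D2', 'Maths', 2), ('D3', 'Geo', 3);
""")

ANTI_JOIN = """
    SELECT s.name
    FROM student s
    LEFT JOIN dept d ON d.emp_id = s.stud_id
    WHERE d.emp_id IS NULL
"""
NOT_IN = """
    SELECT name
    FROM student
    WHERE stud_id NOT IN (SELECT emp_id FROM dept)
"""

def names(sql):
    """Return the student names produced by the given query."""
    return [row[0] for row in conn.execute(sql)]

anti_join = names(ANTI_JOIN)
not_in = names(NOT_IN)

# Hypothetical department without an assigned student (emp_id is NULL).
conn.execute("INSERT INTO dept VALUES ('D4', 'Chem', NULL)")
anti_join_with_null = names(ANTI_JOIN)
not_in_with_null = names(NOT_IN)

print(anti_join, not_in)    # ['D'] ['D']
print(anti_join_with_null)  # ['D']
print(not_in_with_null)     # []
```

If emp_id can be NULL, either add WHERE emp_id IS NOT NULL to the subquery or prefer the anti-join or a NOT EXISTS form.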