Find Missing Pairs in SQL

Assume there's a relational database with 3 tables:
Courses {name, id},
Students {name, id},
Student_Course {student_id, course_id}
I want to write an SQL query that returns the student-course pairs that do NOT exist. If that is not feasible, it would at least be good to know whether any pairs are missing.
Also, since this is a small part of a larger problem I'd like to automate, seeing many different ways of doing it would be useful.

First build all possible pairs, then remove the pairs that are present (either with a left join plus an IS NULL test, or with NOT EXISTS):
select s.id as student_id, c.id as course_id
from Courses as c
cross join Students as s
left join Student_Course as sc on sc.student_id = s.id and sc.course_id = c.id
where sc.course_id is null -- test any sc column that is declared NOT NULL

with Courses as(
select 1 as id,'Math' as name union all
select 2 as id,'English' as name union all
select 3 as id,'Physics' as name union all
select 4 as id,'Chemistry' as name),
Students as(
select 1 as id,'John' as name union all
select 2 as id,'Joseph' as name union all
select 3 as id,'George' as name union all
select 4 as id,'Michael' as name
),
studcrse as(
select 1 as studid, 1 as crseid union all
select 1 as studid, 2 as crseid union all
select 1 as studid, 3 as crseid union all
select 2 as studid, 3 as crseid union all
select 2 as studid, 4 as crseid union all
select 3 as studid, 1 as crseid union all
select 3 as studid, 2 as crseid union all
select 3 as studid, 4 as crseid union all
select 3 as studid, 3 as crseid union all
select 4 as studid, 4 as crseid )
SELECT A.ID AS studentId,a.name as studentname,b.id as crseid,b.name as crsename
from Students as a
cross join
Courses as b
where not exists
(
select 1 from studcrse as c
where c.studid=a.id
and c.crseid=b.id)
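The question also asks for other ways of doing this and for a plain yes/no check. As a minimal sketch, assuming SQLite (run through Python's built-in sqlite3 module) and a reduced, made-up data set rather than the tables above, the same result can be obtained with EXCEPT, and wrapping the anti-join in EXISTS answers only whether any pair is missing:

```python
import sqlite3

# In-memory database with a small, made-up data set (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Courses  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Student_Course (student_id INTEGER NOT NULL,
                                 course_id  INTEGER NOT NULL);
    INSERT INTO Courses  VALUES (1, 'Math'), (2, 'English');
    INSERT INTO Students VALUES (1, 'John'), (2, 'Joseph');
    INSERT INTO Student_Course VALUES (1, 1), (1, 2), (2, 1);
""")

# Variant 1: all possible pairs minus the pairs that exist (set difference).
missing_pairs = sorted(conn.execute("""
    SELECT s.id AS student_id, c.id AS course_id
    FROM Students AS s CROSS JOIN Courses AS c
    EXCEPT
    SELECT student_id, course_id FROM Student_Course
""").fetchall())

# Variant 2: only report whether at least one pair is missing (1 or 0).
has_missing = conn.execute("""
    SELECT EXISTS (
        SELECT 1
        FROM Students AS s CROSS JOIN Courses AS c
        WHERE NOT EXISTS (
            SELECT 1 FROM Student_Course AS sc
            WHERE sc.student_id = s.id AND sc.course_id = c.id
        )
    )
""").fetchone()[0]

print(missing_pairs, has_missing)
```

EXCEPT works here because both sides return the same two columns; the EXISTS form can stop at the first missing pair instead of listing them all.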

Related

Union v/s Inner join on 3 conditions in SQL

Will the following 2 SQL statements give same result?
SQL1-
A inner join B on (condition1)
union
A inner join B on (condition2)
union
A inner join B on (condition3)
SQL2-
A inner join B on (condition1) OR (condition2) OR (condition3)
It depends, at the very least, on whether A or B contains duplicate rows. For example
with A(c) as (
select 1 union all
select 1 union all
select 2
),
B(c) as (
select 1 union all
select 2 union all
select 3
)
select *
from A join B on A.c=B.c
union
select *
from A join B on A.c>B.c
returns 3 rows, because UNION removes duplicates, whereas
with A(c) as (
select 1 union all
select 1 union all
select 2
),
B(c) as (
select 1 union all
select 2 union all
select 3
)
select *
from A join B on A.c=B.c or A.c>B.c
returns 4 rows, because the duplicate row in A is kept.
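The two row counts can be checked with a short script. This is a sketch assuming SQLite through Python's sqlite3 module, with A and B created as real tables instead of CTEs; the third query (adding DISTINCT to the OR form) is an addition for comparison and is not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (c INTEGER);
    CREATE TABLE B (c INTEGER);
    INSERT INTO A VALUES (1), (1), (2);   -- the value 1 appears twice
    INSERT INTO B VALUES (1), (2), (3);
""")

# Form 1: one join per condition, combined with UNION (which deduplicates).
union_rows = conn.execute("""
    SELECT A.c, B.c FROM A JOIN B ON A.c = B.c
    UNION
    SELECT A.c, B.c FROM A JOIN B ON A.c > B.c
""").fetchall()

# Form 2: a single join with the conditions combined by OR (keeps duplicates).
or_rows = conn.execute("""
    SELECT A.c, B.c FROM A JOIN B ON A.c = B.c OR A.c > B.c
""").fetchall()

# Form 2 with DISTINCT, to compare against the UNION form.
or_distinct_rows = conn.execute("""
    SELECT DISTINCT A.c, B.c FROM A JOIN B ON A.c = B.c OR A.c > B.c
""").fetchall()

print(len(union_rows), len(or_rows), len(or_distinct_rows))
```

On this data the OR form matches the UNION form once DISTINCT is added, which suggests the difference between the two statements comes down to duplicate handling.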

Multiple Unioned Self-joins in BigQuery

I have a table with id, name and parent_id where parent_id is a parent hierarchy relating to id, see below.
id  name  parent_id
0   A     null
1   B     0
2   C     1
3   D     1
4   E     2
I'm trying to create a nicer looking table with each id and its parent_id, including multiple levels up in the hierarchy. I use UNION and self-join to accomplish this, but I have a feeling there should be a nicer way of querying it with BigQuery's Standard SQL.
In the query below I go two levels, but you can imagine I want to go 5-6 levels.
WITH T1 as (
select 0 as id, 'A' as name, null as parent_id union all
select 1 as id, 'B' as name, 0 as parent_id union all
select 2 as id, 'C' as name, 1 as parent_id union all
select 3 as id, 'D' as name, 1 as parent_id union all
select 4 as id, 'E' as name, 2 as parent_id
)
SELECT
a.id as id,
a.name as req_name,
FROM T1 as a
UNION ALL
SELECT
a.id as id,
b.name as req_name,
FROM T1 as a
JOIN T1 as b ON a.parent_id = b.id
UNION ALL
SELECT
a.id as id,
c.name as req_name,
FROM T1 as a
JOIN T1 as b on a.parent_id = b.id
JOIN T1 as c on b.parent_id = c.id
resulting in the table
id  req_name
0   A
1   B
2   C
3   D
4   E
2   A
3   A
4   B
1   A
2   B
3   B
4   C
I would be thankful for any insights!
BigQuery does not (yet) support recursive or hierarchical queries. So your approach is actually fine. You can condense it, if you like, using left joins:
with t as (
select 0 as id, 'A' as name, null as parent_id union all
select 1 as id, 'B' as name, 0 as parent_id union all
select 2 as id, 'C' as name, 1 as parent_id union all
select 3 as id, 'D' as name, 1 as parent_id union all
select 4 as id, 'E' as name, 2 as parent_id
)
select distinct id, t1.name
from t t1 left join
t t2
on t2.parent_id = t1.id left join
t t3
on t3.parent_id = t2.id cross join
unnest(array[t1.id, t2.id, t3.id]) id
where id is not null;
You still need explicit joins to the maximum depth of the data.
The other alternative is to use a looping construct, which is available in the scripting language.
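The looping idea can be illustrated outside BigQuery. The following is a sketch in Python, not BigQuery scripting; the function name expand_ancestors is made up for this example. It walks up from each id until it runs out of parents, so no maximum depth has to be fixed in advance.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str, Optional[int]]  # (id, name, parent_id)

rows: List[Row] = [
    (0, "A", None),
    (1, "B", 0),
    (2, "C", 1),
    (3, "D", 1),
    (4, "E", 2),
]

def expand_ancestors(rows: List[Row]) -> List[Tuple[int, str]]:
    """Return (id, req_name) pairs: each id with its own name and every ancestor's name."""
    name_of = {id_: name for id_, name, _ in rows}
    parent_of = {id_: parent for id_, _, parent in rows}
    pairs = []
    for id_ in name_of:
        current = id_
        seen = set()  # guards against cycles in the parent links
        # Stop at the root (parent is None) or at a parent id that is not in the table.
        while current in name_of and current not in seen:
            seen.add(current)
            pairs.append((id_, name_of[current]))
            current = parent_of[current]
    return pairs

pairs = expand_ancestors(rows)
print(sorted(pairs))
```

Unlike the two-level query in the question, this goes all the way to the root, so id 4 is also paired with A.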

How to write SQL join to find description of id using Oracle?

I have 2 input tables, and I need output in string format.
I tried the following query, but it does not work. How can I get the above output?
with
cte1 as --table 1
(select 1 as id , 'A' as abc from dual
union
select 2 as id , 'B' as abc from dual
union
select 3 as id , 'C' as abc from dual
union
select 4 as id , 'D' as abc from dual
union
select 5 as id , 'E' as abc from dual
union
select 6 as id , 'F' as abc from dual
),
cte2 as --table2
(select 1 as id, 3 as name from dual
union
select 1 as id, 5 as name from dual
union
select 1 as id, 4 as name from dual
union
select 2 as id, 3 as name from dual
union
select 2 as id, 6 as name from dual
)
SELECT e.id, e.abc, m.id as mgr, e.abc, c.*
FROM
cte1 e, cte2 m, cte2 c
WHERE e.id = m.id
and
e.id=c.name;
You are trying to join each row in table 1 to two rows in table 2, and the conditions can never both be true.
You want to join each row in table 2 to two rows in table 1:
SELECT e.abc, m.abc
FROM cte2 c, cte1 e, cte1 m
WHERE e.id = c.id
AND m.id = c.name
ORDER BY c.id, c.name;
A A
- -
A C
A D
A E
B C
B F
or with 'modern' join syntax, which you should really be using:
SELECT e.abc, m.abc
FROM cte2 c
JOIN cte1 e ON e.id = c.id
JOIN cte1 m ON m.id = c.name
ORDER BY c.id, c.name;
A A
- -
A C
A D
A E
B C
B F
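Both points can be reproduced with a small script: the original query returns nothing, and the corrected join returns the five rows shown. This is a sketch assuming SQLite through Python's sqlite3 module rather than Oracle, with cte1 and cte2 created as ordinary tables of the same names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cte1 (id INTEGER, abc TEXT);      -- lookup table: id -> description
    CREATE TABLE cte2 (id INTEGER, name INTEGER);  -- both columns reference cte1.id
    INSERT INTO cte1 VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'E'), (6,'F');
    INSERT INTO cte2 VALUES (1,3), (1,5), (1,4), (2,3), (2,6);
""")

# The query from the question: e.id must equal a cte2.id (1 or 2) and also
# a cte2.name (3, 4, 5 or 6), which no row of cte1 can satisfy.
original_rows = conn.execute("""
    SELECT e.id, e.abc, m.id AS mgr, c.id, c.name
    FROM cte1 AS e, cte2 AS m, cte2 AS c
    WHERE e.id = m.id AND e.id = c.name
""").fetchall()

# The corrected form: start from cte2 and look up each of its columns in cte1.
fixed_rows = conn.execute("""
    SELECT e.abc, m.abc
    FROM cte2 AS c
    JOIN cte1 AS e ON e.id = c.id
    JOIN cte1 AS m ON m.id = c.name
    ORDER BY c.id, c.name
""").fetchall()

print(original_rows)
print(fixed_rows)
```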

Recursive calculation to form a tree using sql

I am working on a simple problem and want to solve it in SQL. I have 3 tables: Category, Item, and a relational table CategoryItem. I need to return the count of items per category, but the twist is that categories are arranged in parent-child relationships, and the count of items in child categories should be added to the count of their parent category. Please consider the sample data below and the expected result set.
Id Name ParentCategoryId
1 Category1 Null
2 Category1.1 1
3 Category2.1 2
4 Category1.2 1
5 Category3.1 3
ID CategoryId ItemId
1 5 1
2 4 2
3 5 2
4 3 1
5 2 3
6 1 1
7 3 2
Result:
CategoryName Count
Category1 7
Category1.1 5
Category2.1 4
Category1.2 1
Category3.1 2
I can do it in my business layer, but performance is not optimal because of the size of the data. I am hoping that doing it in the data layer will improve performance greatly.
Thanks in advance for your reply.
Your tables and sample data:
create table #Category(Id int identity(1,1),Name Varchar(255),parentId int)
INSERT INTO #Category(Name,parentId) values
('Category1',null),('Category1.1',1),('Category2.1',2),
('Category1.2',1),('Category3.1',3)
create table #CategoryItem(Id int identity(1,1),categoryId int,itemId int)
INSERT INTO #CategoryItem(categoryId,itemId) values
(5,1),(4,2),(5,2),(3,1),(2,3),(1,1),(3,2)
create table #Item(Id int identity(1,1),Name varchar(255))
INSERT INTO #Item(Name) values('item1'),('item2'),('item3')
Find all children of each parent with a recursive common table expression:
;WITH CategorySearch(ID, parentId) AS
(
SELECT ID, ID AS ParentId FROM #Category
UNION ALL
SELECT CT.Id,CS.parentId FROM #Category CT
INNER JOIN CategorySearch CS ON CT.ParentId = CS.ID
)
select * from CategorySearch order by 1,2
Output: each category (ID) paired with itself and with each of its ancestors (parentId)
ID parentId
1 1
2 1
3 1
4 1
5 1
2 2
3 2
5 2
3 3
5 3
4 4
5 5
Final query for your result: count all items for each category and its child categories.
;WITH CategorySearch(ID, parentId) AS
(
SELECT ID, ID AS ParentId FROM #Category
UNION ALL
SELECT CT.Id,CS.parentId FROM #Category CT
INNER JOIN CategorySearch CS ON CT.ParentId = CS.ID
)
SELECT CA.Name AS CategoryName,count(itemId) CountItem
FROM #Category CA
INNER JOIN CategorySearch CS ON CS.ParentId = CA.id
INNER JOIN #CategoryItem MI ON MI.CategoryId =CS.ID
GROUP BY CA.Name
Output:
CategoryName CountItem
Category1 7
Category1.1 5
Category1.2 1
Category2.1 4
Category3.1 2
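The same roll-up can be tried outside SQL Server. This is a sketch assuming SQLite through Python's sqlite3 module, where the CTE needs the RECURSIVE keyword and permanent tables stand in for the temp tables. Two details differ from the query above and are assumptions of this sketch: the column is named AncestorId for readability, and a LEFT JOIN plus grouping by Id is used so that a category with no items anywhere below it would still be listed with 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (Id INTEGER PRIMARY KEY, Name TEXT, ParentCategoryId INTEGER);
    CREATE TABLE CategoryItem (Id INTEGER PRIMARY KEY, CategoryId INTEGER, ItemId INTEGER);
    INSERT INTO Category VALUES
        (1,'Category1',NULL), (2,'Category1.1',1), (3,'Category2.1',2),
        (4,'Category1.2',1), (5,'Category3.1',3);
    INSERT INTO CategoryItem VALUES
        (1,5,1), (2,4,2), (3,5,2), (4,3,1), (5,2,3), (6,1,1), (7,3,2);
""")

counts = conn.execute("""
    WITH RECURSIVE CategorySearch(Id, AncestorId) AS (
        -- anchor: every category is paired with itself
        SELECT Id, Id FROM Category
        UNION ALL
        -- step: a child inherits every ancestor already found for its parent
        SELECT ct.Id, cs.AncestorId
        FROM Category AS ct
        JOIN CategorySearch AS cs ON ct.ParentCategoryId = cs.Id
    )
    SELECT ca.Name, COUNT(ci.ItemId) AS CountItem
    FROM Category AS ca
    JOIN CategorySearch AS cs ON cs.AncestorId = ca.Id
    LEFT JOIN CategoryItem AS ci ON ci.CategoryId = cs.Id
    GROUP BY ca.Id, ca.Name
    ORDER BY ca.Id
""").fetchall()

for name, count in counts:
    print(name, count)
```

Grouping by Id as well as Name keeps two categories that happen to share a name from being merged into one row.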
With the help of a recursive CTE (common table expression), you can achieve what you are looking for.
See the Microsoft documentation for more details on recursive CTEs: CTE MS SQL 2008 +
Here is a complete example with your sample data:
-- tables definition
SELECT 1 as id, 'cat1' as [name],NULL as id_parent
into cat
union
select 2, 'cat1.1', 1
union
select 3, 'cat2.1', 2
union
select 4, 'cat1.2', 1
union
select 5, 'cat3.1', 3
select 1 as id , 5 as id_cat, 1 as id_item
iNTO item
UNION
select 2, 4, 2
UNION
select 3, 5, 2
UNION
select 4, 3, 1
UNION
select 5, 2, 3
UNION
select 6, 1, 1
UNION
select 7, 3, 2
-- CTE to get desired result
;with childs -- the leading semicolon terminates the preceding statement, which T-SQL requires before WITH
as
(
select c.id, c.id_parent
from cat c
UNION ALL
select s.id, p.id_parent
from cat s JOIN childs p
ON (s.id_parent=p.id)
),
category_count
AS
(
SELECT c.id, c.name, count(i.id) as items
from cat c left outer join item i
on (c.id=i.id_cat)
GROUP BY c.id,c.name
),
pairs
AS
(
SELECT id, ISNULL(id_parent,id) as id_parent
FROM childs
)
select p.id_parent, n.name, sum(items)
from pairs p JOIN category_count cc
ON (p.id=cc.id)
join cat n ON (p.id_parent=n.id)
GROUP by p.id_parent ,n.name
ORDER by 1;

TSQL Finding Incomplete courses

From the StudentDetails Table
StudentDetails
SID Name CourseCompleted
1 Andrew CS001
1 Andrew CS002
1 Andrew CS003
2 Grey CS001
2 Grey CS005
2 Grey CS002
3 Jon CS002
3 Jon CS005
3 Jon CS008
How can I generate the following output (courses not completed by each student)?
SID Name Course Not Completed
1 Andrew CS005
1 Andrew CS008
2 Grey CS003
2 Grey CS008
3 Jon CS001
3 Jon CS003
select distinct a.SID, a.Name, b.CourseCompleted as [Course Not Completed]
from StudentDetails a,
(select distinct CourseCompleted from StudentDetails) b
where not exists
(select 1 from StudentDetails where SID = a.SID and CourseCompleted = b.CourseCompleted)
order by a.SID
select s.SID, s.Name, c.Course as [Course Not Completed]
from (select distinct CourseCompleted [Course] from StudentDetails) c,
StudentDetails s
where not exists (
select * from StudentDetails where SID=s.SID and CourseCompleted=c.Course
)
Of course, if you have a table listing all possible courses, you could replace the subquery in the from clause with that table.
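A separate course table also changes the result when it lists a course that nobody has completed, since such a course never shows up in StudentDetails. The following is a sketch assuming SQLite through Python's sqlite3 module; the Courses table and the course code CS009 are hypothetical and not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentDetails (SID INTEGER, Name TEXT, CourseCompleted TEXT);
    INSERT INTO StudentDetails VALUES
        (1,'Andrew','CS001'), (1,'Andrew','CS002'), (1,'Andrew','CS003'),
        (2,'Grey','CS001'),   (2,'Grey','CS005'),   (2,'Grey','CS002'),
        (3,'Jon','CS002'),    (3,'Jon','CS005'),    (3,'Jon','CS008');

    -- Hypothetical master list of courses; CS009 has been completed by nobody.
    CREATE TABLE Courses (Course TEXT PRIMARY KEY);
    INSERT INTO Courses VALUES
        ('CS001'), ('CS002'), ('CS003'), ('CS005'), ('CS008'), ('CS009');
""")

rows = conn.execute("""
    SELECT s.SID, s.Name, c.Course
    FROM (SELECT DISTINCT SID, Name FROM StudentDetails) AS s
    CROSS JOIN Courses AS c
    WHERE NOT EXISTS (
        SELECT 1 FROM StudentDetails AS d
        WHERE d.SID = s.SID AND d.CourseCompleted = c.Course
    )
    ORDER BY s.SID, c.Course
""").fetchall()

for row in rows:
    print(row)
```

With the derived course list each student has two missing courses; with this Courses table each has three, because CS009 is reported for everyone.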
With StudentDetails As
(
SELECT 1 SID, 'Andrew' Name, 'CS001' CourseCompleted union all
SELECT 1, 'Andrew', 'CS002' union all
SELECT 1 , 'Andrew' , 'CS003' union all
SELECT 2 , 'Grey' , 'CS001' union all
SELECT 2 , 'Grey' , 'CS005' union all
SELECT 2 , 'Grey' , 'CS002' union all
SELECT 3 , 'Jon' , 'CS002' union all
SELECT 3 , 'Jon' , 'CS005' union all
SELECT 3 , 'Jon' , 'CS008'
),
Courses AS
(
SELECT DISTINCT CourseCompleted AS Course
FROM StudentDetails
),
Students As
(
SELECT DISTINCT SID, Name
FROM StudentDetails
)
SELECT s.SID, s.name, c.Course AS [Course not Completed] FROM Students s
CROSS JOIN Courses c
EXCEPT
SELECT SID, name, CourseCompleted
FROM StudentDetails
ORDER BY SID, [Course not Completed]