Attempting to get a table from SQL Request - sql

I'm working on a project right now and I need to run some queries against my DB via SQL*Plus.
Here is what I'm trying to do.
I want a result table with each professor's first and last name, subject to these conditions (I have to check the first condition, then the other):
(First) In one session (let's say 12004), the professor taught both of these courses: INF3180 and INF2110
(Second) In another session, 32003, the professor taught both of these courses: INF1130 and INF1110
Here is the code that created the DB:
CREATE TABLE Professor
(professorCode CHAR(5) NOT NULL,
lastName VARCHAR(10) NOT NULL,
firstName VARCHAR(10) NOT NULL,
CONSTRAINT PrimaryKeyProfessor PRIMARY KEY (professorCode)
)
;
CREATE TABLE Group
(sigle CHAR(7) NOT NULL,
noGroup INTEGER NOT NULL,
sessionCode INTEGER NOT NULL,
maxInscriptions INTEGER NOT NULL,
professorCode CHAR(5) NOT NULL,
CONSTRAINT PrimaryKeyGroup PRIMARY KEY
(sigle,noGroup,sessionCode),
CONSTRAINT CESigleGroupeRefCours FOREIGN KEY (sigle) REFERENCES Cours,
CONSTRAINT CECodeSessionRefSession FOREIGN KEY (sessionCode) REFERENCES
Session,
CONSTRAINT CEcodeProfRefProfessor FOREIGN KEY(professorCode) REFERENCES
Professor
)
;
And here is my current query, which does not work:
SELECT DISTINCT Professor.firstName, Professor.lastName
FROM Professor, Group
WHERE Group.professorCode = Professor.professorCode
AND Group.sessionCode = 32003
AND (Group.sigle = 'INF1130' AND
Group.sigle = 'INF1110')
OR Group.sessionCode = 12004
AND (Group.sigle = 'INF3180' AND
Group.sigle = 'INF2110')
I know there is a way to combine both results, but I can't seem to find it.
In this case only one match is possible:
One match for 32003: INF1130, INF1110
No match for 12004: INF3180, INF2110
The resulting table is supposed to look like this:
--------------------------
First Name Last Name
--------------------------
Denis Tremblay
The solution proposed by Gordon Linoff looks very good, except that it returns no rows for me: as written, a professor must have all 4 courses across both sessionCode values to be included. What I need is to check each condition separately and append the results. Say the condition for session 12004 matches nothing; then I can treat it as empty (NULL). The second condition, for session 32003, gives me one match. Appending both results should give the table shown above.
I want to do this in a single query.
Thanks A LOT!
EDIT : Reformulated
EDIT2 : Gave an example of a known match
EDIT3 : Further explanation why the proposed solution isn't working

Think: group by and having. More importantly, think JOIN, JOIN, JOIN. Never use commas in the from clause.
SELECT p.firstName, p.lastName
FROM Professor p JOIN
Group g
ON g.professorCode = p.professorCode
WHERE (g.sessionCode, g.sigle) IN ( (32003, 'INF1130'), (32003, 'INF1110'),
(12004, 'INF3180'), (12004, 'INF2110')
)
GROUP BY p.firstName, p.lastName
HAVING COUNT(DISTINCT g.sigle) = 4; -- has all four

It seems like you want to list any professor who either taught INF1130 and INF1110 in 32003; or taught INF3180 and INF2110 in 12004. Unfortunately you've presented that as AND (i.e. they have to have taught all four courses - one pair of courses AND the other), not OR (one set of courses OR the other).
As a long-winded way of expanding what I think you want:
SELECT p.firstName, p.lastName
FROM Professor p
WHERE (
EXISTS (
SELECT *
FROM GroupX g
WHERE professorCode = p.professorCode
AND sessionCode = 32003
AND sigle = 'INF1130'
)
AND EXISTS (
SELECT *
FROM GroupX g
WHERE professorCode = p.professorCode
AND sessionCode = 32003
AND sigle = 'INF1110'
)
)
OR (
EXISTS (
SELECT *
FROM GroupX g
WHERE professorCode = p.professorCode
AND sessionCode = 12004
AND sigle = 'INF3180'
)
AND EXISTS (
SELECT *
FROM GroupX g
WHERE professorCode = p.professorCode
AND sessionCode = 12004
AND sigle = 'INF2110'
)
);
Four subqueries isn't going to be terribly efficient. You could do multiple joins instead.
If you will always be looking for two sigle values per sessionCode then you could modify Gordon's answer to count how many matches there are per sessionCode, by adding that to the group-by clause:
SELECT p.firstName, p.lastName
FROM GroupX g
JOIN Professor p
ON p.professorCode = g.professorCode
WHERE (g.sessionCode, g.sigle) IN ( (32003, 'INF1130'), (32003, 'INF1110'),
(12004, 'INF3180'), (12004, 'INF2110')
)
GROUP BY p.firstName, p.lastName, g.sessionCode
HAVING COUNT(*) = 2;
If you did have a professor who taught all four then you would get them listed twice; if that can happen you could add your DISTINCT back in, though that feels a bit wrong. You could also use a subquery and IN to avoid that:
SELECT p.firstName, p.lastName
FROM Professor p
WHERE ProfessorCode IN (
SELECT professorCode
FROM GroupX
WHERE (sessionCode, sigle) IN ( (32003, 'INF1130'), (32003, 'INF1110'),
(12004, 'INF3180'), (12004, 'INF2110')
)
GROUP BY professorCode, sessionCode
HAVING COUNT(*) = 2
)
(I've changed Group to GroupX because Group isn't a valid identifier: it's a reserved keyword. I assume you've changed your real names - maybe from another language?)
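To make the "one pair of courses OR the other" idea easy to try out, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The professor codes and rows are made up for illustration, and the row-value IN list is spelled out with AND/OR because I'm not assuming SQLite accepts that syntax. The second professor teaches one course from each pair, so he must not be returned.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Professor (professorCode TEXT PRIMARY KEY, lastName TEXT, firstName TEXT);
CREATE TABLE GroupX (sigle TEXT, noGroup INTEGER, sessionCode INTEGER, professorCode TEXT);
INSERT INTO Professor VALUES ('TREJ4', 'Tremblay', 'Denis'), ('LAVP1', 'Lavoie', 'Paul');
INSERT INTO GroupX VALUES
  ('INF1130', 10, 32003, 'TREJ4'),
  ('INF1110', 20, 32003, 'TREJ4'),
  ('INF3180', 30, 12004, 'LAVP1'),
  ('INF1130', 30, 32003, 'LAVP1');
""")

# Group by professor AND session, so each pair of courses is checked on its own.
rows = conn.execute("""
SELECT p.firstName, p.lastName
FROM Professor p
WHERE p.professorCode IN (
    SELECT g.professorCode
    FROM GroupX g
    WHERE (g.sessionCode = 32003 AND g.sigle IN ('INF1130', 'INF1110'))
       OR (g.sessionCode = 12004 AND g.sigle IN ('INF3180', 'INF2110'))
    GROUP BY g.professorCode, g.sessionCode
    HAVING COUNT(DISTINCT g.sigle) = 2
)
""").fetchall()

print(rows)
```

COUNT(DISTINCT g.sigle) is used instead of COUNT(*) so that a professor who teaches two groups of the same course in one session is not counted as having taught both courses.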

Use a modern join:
SELECT Professor.firstName, Professor.lastName
FROM Professor join "Group" g on
g.professorCode = Professor.professorCode
where g.sessionCode in( 32003,12004 )
AND g.sigle in( 'INF1130', 'INF1110','INF3180','INF2110')
group by Professor.firstName, Professor.lastName
having count( distinct sigle )=4


Matching similar entities based on many to many relationship

I have two entities in my database that are connected with a many to many relationship. I was wondering what would be the best way to list which entities have the most similarities based on it?
I tried doing a count(*) with intersect, but the query takes too long to run on every entry in my database (there are about 20k records). When running the query I wrote, CPU usage jumps to 100% and the database has locking issues.
Here is some code showing what I've tried:
My tables look something along these lines:
/* 20k records */
create table Movie(
Id INT PRIMARY KEY,
Title varchar(255)
);
/* 200-300 records */
create table Tags(
Id INT PRIMARY KEY,
Descr varchar(255)
);
/* 200,000-300,000 records */
create table TagMovies(
Movie_Id INT,
Tag_Id INT,
PRIMARY KEY (Movie_Id, Tag_Id),
FOREIGN KEY (Movie_Id) REFERENCES Movie(Id),
FOREIGN KEY (Tag_Id) REFERENCES Tags(Id),
);
(This works, but it is terribly slow)
This is the query that I wrote to try and list them:
Usually I also filter with top 1 & add a where clause to get a specific set of related data.
SELECT
bk.Id,
rh.Id
FROM
Movies bk
CROSS APPLY (
SELECT TOP 15
b.Id,
/* Tags Score */
(
SELECT COUNT(*) FROM (
SELECT x.Tag_Id FROM TagMovies x WHERE x.Movie_Id = bk.Id
INTERSECT
SELECT x.Tag_Id FROM TagMovies x WHERE x.Movie_Id = b.Id
) Q1
)
as Amount
FROM
Movies b
WHERE
b.Id <> bk.Id
ORDER BY Amount DESC
) rh
Explanation:
Movies have tags, and the user can try to find movies similar to the one they selected, based on other movies that have similar tags.
Hmm ... just an idea, but maybe I didn't understand ...
This query should return best matched movies by tags for a given movie ID:
SELECT m.id, m.title, GROUP_CONCAT(DISTINCT t.Descr SEPARATOR ', ') as tags, count(*) as matches
FROM stack.Movie m
LEFT JOIN stack.TagMovies tm ON m.Id = tm.Movie_Id
LEFT JOIN stack.Tags t ON tm.Tag_Id = t.Id
WHERE m.id != 1
AND tm.Tag_Id IN (SELECT Tag_Id FROM stack.TagMovies tm WHERE tm.Movie_Id = 1)
GROUP BY m.id
ORDER BY matches DESC
LIMIT 15;
EDIT:
I just realized that it's for MS SQL Server ... but maybe something similar can be done...
You should probably decide on a naming convention and stick with it. Are tables singular or plural nouns? I don't want to get into that debate, but pick one or the other.
Without access to your database I don't know how this will perform. It's just off the top of my head. You could also limit this by the M.id value to find the best matches for a single movie, which I think would improve performance by quite a bit.
Also, TOP x should let you get the x closest matches.
SELECT
M.id,
M.title,
SM.id AS similar_movie_id,
SM.title AS similar_movie_title,
COUNT(*) AS matched_tags
FROM
Movie M
INNER JOIN TagMovies TM1 ON TM1.movie_id = M.id
INNER JOIN TagMovies TM2 ON
TM2.tag_id = TM1.tag_id AND
TM2.movie_id <> TM1.movie_id
INNER JOIN Movie SM ON SM.id = TM2.movie_id
GROUP BY
M.id,
M.title,
SM.id,
SM.title
ORDER BY
COUNT(*) DESC
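Here is a small sketch of the same self-join idea restricted to a single movie, run against an in-memory SQLite database from Python. The sample rows and the helper name similar_movies are made up for illustration, and LIMIT stands in for TOP because this is SQLite rather than SQL Server, so treat it as a shape to adapt, not a performance claim.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (Id INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE TagMovies (Movie_Id INTEGER, Tag_Id INTEGER, PRIMARY KEY (Movie_Id, Tag_Id));
INSERT INTO Movie VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO TagMovies VALUES (1, 10), (1, 20), (1, 30),
                             (2, 10), (2, 20), (2, 30),
                             (3, 10), (3, 40),
                             (4, 50);
""")

def similar_movies(movie_id, limit=15):
    """Return (other_movie_id, shared_tag_count) pairs, best matches first."""
    return conn.execute("""
        SELECT tm2.Movie_Id, COUNT(*) AS matched_tags
        FROM TagMovies tm1
        JOIN TagMovies tm2
          ON tm2.Tag_Id = tm1.Tag_Id
         AND tm2.Movie_Id <> tm1.Movie_Id
        WHERE tm1.Movie_Id = ?
        GROUP BY tm2.Movie_Id
        ORDER BY matched_tags DESC, tm2.Movie_Id
        LIMIT ?
    """, (movie_id, limit)).fetchall()

result = similar_movies(1)
print(result)
```

Movies that share no tag with the selected one (movie 4 here) simply do not appear, because the join only produces rows for tags both movies have.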

refer to array from SELECT clause in WHERE clause

I have the following two tables:
create table person
(
identifier integer not null,
name text not null,
age integer not null,
primary key(identifier)
);
create table agenda
(
identifier integer not null,
name text not null,
primary key(identifier)
);
They are joined with the following table:
create table person_agenda
(
person_identifier integer not null,
agenda_identifier integer not null,
primary key(person_identifier, agenda_identifier),
foreign key(person_identifier) references person(identifier),
foreign key(agenda_identifier) references agenda(identifier)
);
I am trying to refer to an array, as defined in the SELECT clause, in the WHERE clause.
The following works:
select identifier, name, array(select identifier from agenda a, person_agenda pa where person_identifier = p.identifier and identifier = agenda_identifier and name = '...') as r
from person p;
This does not:
select identifier, name, array(select identifier from agenda a, person_agenda pa where person_identifier = p.identifier and identifier = agenda_identifier and name = '...') as r
from person p
where array_length(r, 1) >= 1;
It says that r is not a known column. How can I refer to this array in the WHERE clause?
The purpose of my second query is to:
- omit persons without agendas (by filtering on array_length() >= 1)
- get all agenda identifiers, so I can fetch their information in a subsequent query without having to filter again (on field agenda.name in my example above) (by projecting the array in the SELECT clause)
The first bullet can be done with a simple join. But, for the first bullet in combination with the second bullet, I need some kind of aggregation on the agenda identifiers. I thought arrays would be useful for this.
edit
According to user Saba, this is not possible. Thanks for your feedback.
Is the following query a good alternative?
select person.identifier, person.name, array_agg(agenda.identifier)
from person, person_agenda, agenda
where person.identifier = person_identifier and
agenda.identifier = agenda_identifier and
agenda.name = '...'
group by person.identifier;
A column alias can't be used in the WHERE clause; if you want to use the alias in your WHERE clause, you need to wrap the query in a subquery or CTE. That is because the query is logically processed in the following order:
FROM
ON
JOIN
**WHERE**
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
**SELECT**
DISTINCT
ORDER BY
TOP
Try something like this:
SELECT *
FROM (
select identifier, name, array(select identifier from agenda a, person_agenda pa where person_identifier = p.identifier and identifier = agenda_identifier and name = '...') as r
from person p
) AS per
where array_length(r, 1) >= 1;
Your case seems straightforward; the query below should work for you:
select person.identifier, person.name, array_agg(agenda.identifier)
from person, person_agenda, agenda
where person.identifier = person_identifier and
agenda.identifier = agenda_identifier and
agenda.name = '...'
group by person.identifier, person.name;
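For anyone who wants to check the join-and-aggregate version quickly, here is a rough sketch in Python with an in-memory SQLite database. SQLite has no array type, so group_concat stands in for PostgreSQL's array_agg here; the sample rows and the agenda name 'work' are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (identifier INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL);
CREATE TABLE agenda (identifier INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_agenda (
    person_identifier INTEGER NOT NULL,
    agenda_identifier INTEGER NOT NULL,
    PRIMARY KEY (person_identifier, agenda_identifier)
);
INSERT INTO person VALUES (1, 'Ann', 30), (2, 'Bob', 40), (3, 'Cid', 50);
INSERT INTO agenda VALUES (100, 'work'), (101, 'work'), (102, 'home');
INSERT INTO person_agenda VALUES (1, 100), (1, 101), (2, 102), (3, 100);
""")

rows = conn.execute("""
    SELECT p.identifier, p.name, group_concat(a.identifier) AS agenda_ids
    FROM person p
    JOIN person_agenda pa ON pa.person_identifier = p.identifier
    JOIN agenda a ON a.identifier = pa.agenda_identifier
    WHERE a.name = ?
    GROUP BY p.identifier, p.name
    ORDER BY p.identifier
""", ("work",)).fetchall()

# group_concat gives a comma-separated string in no guaranteed order,
# so parse and sort it on the Python side.
agendas = {pid: sorted(int(x) for x in ids.split(",")) for pid, _name, ids in rows}
print(agendas)
```

Because the joins are inner joins, a person with no matching agenda (Bob in this data) is dropped without any array_length filter, which covers both goals from the question in one query.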

Connecting 4 tables

I am only a beginner in SQL, and I have problem that I can not solve.
The problem is the following:
I have four tables:
Student: matrnr, name, semester, start_date
Listening: matrnr<Student>, vorlnr<Subject>
Subject: vorlnr, title, sws, teacher<Professor>
Professor: persnr, name, rank, room
I need to list all the students who are listening to a Subject taught by a Professor with some given name.
EDIT:
select s.*
from Student s, Listening h
where s.matrnr=h.matrnr
and h.vorlnr in (select v.vorlnr from Subject v, Professor p
where v.teacher=p.persnr and p.name='Kant');
This is how I solved it, but I am not sure whether it is an optimal solution.
Your approach is good. However, you want to show students, but you join students with listenings and so get student-listening combinations (a student can appear more than once).
Moreover, you use a join syntax that is outdated. It was replaced more than twenty years ago by explicit joins (INNER JOIN, CROSS JOIN, etc.).
You can do it with subqueries only:
select *
from Student
where matrnr in
(
select matrnr
from Listening
where vorlnr in
(
select vorlnr
from Subject
where teacher in
(
select persnr
from Professor
where name='Kant'
)
)
);
Or join the other tables:
select *
from Student
where matrnr in
(
select l.matrnr
from Listening l
inner join Subject s on s.vorlnr = l.vorlnr
inner join Professor p on p.persnr = s.teacher and p.name='Kant'
);
Or with EXISTS:
select *
from Student s
where exists
(
select *
from Listening l
inner join Subject su on su.vorlnr = l.vorlnr
inner join Professor p on p.persnr = su.teacher and p.name='Kant'
where l.matrnr = s.matrnr
);
Some people like to join everything and then clean up in the end using DISTINCT. This is easy to write, especially as you don't have to think your query through at first. But for the same reason it can get complicated when more tables and more logic are involved (like aggregations) and it can become quite hard to read, too.
select distinct s.*
from Student s
inner join Listening l on l.matrnr = s.matrnr
inner join Subject su on su.vorlnr = l.vorlnr
inner join Professor p on p.persnr = su.teacher and p.name='Kant';
In the end, it is a matter of taste.
When you have an SQL problem, a good way of presenting the problem is to show us the tables as CREATE TABLE statements. Such statements show details such as the types of the columns and which columns are primary keys. Additionally this allows us to actually build a little database in order to reproduce a faulty behavior or just to test our solutions.
CREATE TABLE Student
(
matrnr NUMBER(9) PRIMARY KEY,
name NVARCHAR2(50),
semester NUMBER(2),
start_date DATE
);
CREATE TABLE Listening
(
matrnr NUMBER(9), -- Student
vorlnr NUMBER(9), -- Subject
CONSTRAINT PK_Listening PRIMARY KEY (matrnr, vorlnr)
);
CREATE TABLE Subject
(
vorlnr NUMBER(9) PRIMARY KEY,
title NVARCHAR2(50),
sws NVARCHAR2(50),
teacher NUMBER(9) -- Professor
);
CREATE TABLE Professor
(
persnr NUMBER(9) PRIMARY KEY,
name NVARCHAR2(50),
rank NUMBER(3),
room NVARCHAR2(50)
);
Using this schema, my solution would look like this:
SELECT *
FROM
Student
WHERE
matrnr IN (
SELECT L.matrnr
FROM
Listening L
INNER JOIN Subject S
ON L.vorlnr = S.vorlnr
INNER JOIN Professor P
ON S.teacher = P.persnr
WHERE P.name = 'Kant'
);
You can find it here: http://sqlfiddle.com/#!4/5179dc/2
Since I didn't insert any records, the only thing it is testing is the syntax and the correct use of table and column names.
Your solution is suboptimal. It does not differentiate between joining of tables and additional conditions specified as where-clause. It can produce several result records per student if they attend several courses of the professor. Therefore my solution puts all the other tables into the sub-select.
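The duplicate-row point can be seen with a tiny data set. This is a sketch using Python's sqlite3 module with invented rows (a student who attends two of Kant's courses); the table and column names follow the schema above, with SQLite column types substituted for the Oracle ones.

```python
import sqlite3

# Hypothetical sample data: student 1 attends two subjects taught by Kant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (matrnr INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Professor (persnr INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Subject (vorlnr INTEGER PRIMARY KEY, title TEXT, teacher INTEGER);
CREATE TABLE Listening (matrnr INTEGER, vorlnr INTEGER, PRIMARY KEY (matrnr, vorlnr));
INSERT INTO Student VALUES (1, 'Fichte'), (2, 'Jonas');
INSERT INTO Professor VALUES (10, 'Kant'), (11, 'Russel');
INSERT INTO Subject VALUES (100, 'Ethics', 10), (101, 'Logic', 10), (102, 'Epistemology', 11);
INSERT INTO Listening VALUES (1, 100), (1, 101), (2, 102);
""")

# Joining everything yields one row per student-subject combination.
joined = conn.execute("""
    SELECT s.name
    FROM Student s
    JOIN Listening l ON l.matrnr = s.matrnr
    JOIN Subject su ON su.vorlnr = l.vorlnr
    JOIN Professor p ON p.persnr = su.teacher
    WHERE p.name = 'Kant'
""").fetchall()

# Keeping the other tables inside the sub-select yields one row per student.
in_subselect = conn.execute("""
    SELECT name
    FROM Student
    WHERE matrnr IN (
        SELECT l.matrnr
        FROM Listening l
        JOIN Subject su ON su.vorlnr = l.vorlnr
        JOIN Professor p ON p.persnr = su.teacher
        WHERE p.name = 'Kant'
    )
""").fetchall()

print(joined, in_subselect)
```

The plain join returns the same student twice, which is why it needs DISTINCT, while the IN version returns each student at most once by construction.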
select st.name
from student st
join listening l on l.matrnr = st.matrnr
join subject su on su.vorlnr = l.vorlnr
join professor p on su.teacher = p.persnr
where p.name = 'some name'
SELECT *
FROM student
INNER JOIN listening ON student.matrnr = listening.matrnr
INNER JOIN subject ON listening.vorlnr = subject.vorlnr
INNER JOIN professor ON subject.teacher = professor.persnr
WHERE professor.name = 'some name'

How do I get the names of a subject's pre-requisites?

Here's a rough sketch. I have a pre-requisite table and subject table.
I have a rough idea how I can list the subject code. But I am really unsure how to write a query that lists the name and details of a subject and its pre-requisites.
For example, I would like to write a query that will list out the subjects names and its pre requisite names.
So the resultant would come out as (Well I'll do the concatenating texts later):
"Introduction to Computer is a pre-requisite of Operating Systems".
I'm just wondering how do I extract the names of subjects off these two tables?
CREATE TABLE subjects (
subject_code VARCHAR(7) NOT NULL CONSTRAINT subject_pk PRIMARY KEY,
subject_name VARCHAR(50) NOT NULL,
subject_details TEXT NOT NULL
);
CREATE TABLE SubjectPrerequisite
( Primary_Subject_Code VARCHAR(7) NOT NULL,
Prerequisite_Subject_Code VARCHAR(7) NOT NULL,
CONSTRAINT PK_SubjectPrerequisite PRIMARY KEY (Primary_Subject_Code, Prerequisite_Subject_Code),
CONSTRAINT FK_SubjectPrerequisite_Primary_Subject_Code FOREIGN KEY (Primary_Subject_Code) REFERENCES subjects (subject_code),
CONSTRAINT FK_SubjectPrerequisite_Prerequisite_Subject_Code FOREIGN KEY (Prerequisite_Subject_Code) REFERENCES subjects (subject_code)
)
//EDIT: Here's what I have so far
SELECT subject_name
FROM SubjectPreRequisite t0
INNER JOIN subjects t1
ON t0.subject_code = s1.prerequisite_subject_code
Assuming you want the total list of subject names, do this query:
select subject_name from subjects
Assuming you want each subject's pre-requisites, and subject_code is related to Primary_Subject_Code, use this query:
select s.subject_name, r.Prerequisite_Subject_Code
from subjects s
inner join SubjectPrerequisite r on s.subject_code = r.Primary_Subject_Code
And with your concat:
select r.Prerequisite_Subject_Code + ' is a pre-requisite of ' + s.subject_name as 'Pre-Requisites'
from subjects s
inner join SubjectPrerequisite r on s.subject_code = r.Primary_Subject_Code
I assume (perhaps wrongly) you are looking to concatenate the subject names of the prerequisites into a single row. Below is a SQL Server example of how this can be done:
;WITH Prerequisites AS
( SELECT Primary_Subject_Code, Subject_Name
FROM SubjectPrerequisite
INNER JOIN Subjects
ON Subject_Code = Prerequisite_Subject_Code
)
SELECT Subject_Code,
Subject_Name,
Subject_Details,
STUFF( ( SELECT ',' + Subject_Name
FROM Prerequisites
WHERE Primary_Subject_Code = Subject_Code
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)'), 1, 1, '') AS PrerequisiteList
FROM Subjects
I have previously given a full explanation of how the XML PATH Method works here. With a further improvement to my answer pointed out here
See also SQL Server - Possible Pivot Solution?
If you just want to extract the pre-requisite relationships to 1 level of recursion you could do
SELECT
[original].[subject_code] [OriginalCode]
, [prereq].[subject_code] [Pre-RequisiteCode]
FROM
[subjects] [original]
LEFT JOIN
[SubjectPrerequisite] [spr]
ON [spr].[Primary_Subject_Code] = [original].[subject_code]
JOIN
[subjects] [prereq]
ON [prereq].[subject_code] = [spr].[Prerequisite_Subject_Code]
ORDER BY
[OriginalCode]
, [Pre-RequisiteCode]
If you want to show the recursive chain and concatenate the subjects in some way then a CTE like GarethD's answer is the way to go. However, I suggest doing that with SQL would be wrong for an n-tier application.
Think I solved it:
SELECT t.subject_name + ' is a pre-requisite of ' + s.subject_name
FROM subjects s
INNER JOIN pre_requisites r ON s.subject_code = r.subject_code
INNER JOIN subjects t ON t.subject_code = r.pre_requisite_code
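The key step is joining the subjects table twice, once per code column of the link table. Below is a minimal sketch of that using the schema posted in the question (subjects and SubjectPrerequisite) with Python's sqlite3 module. The subject codes are invented, and SQLite concatenates with || where SQL Server uses +.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the posted schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subjects (
    subject_code TEXT PRIMARY KEY,
    subject_name TEXT NOT NULL
);
CREATE TABLE SubjectPrerequisite (
    Primary_Subject_Code TEXT NOT NULL REFERENCES subjects (subject_code),
    Prerequisite_Subject_Code TEXT NOT NULL REFERENCES subjects (subject_code),
    PRIMARY KEY (Primary_Subject_Code, Prerequisite_Subject_Code)
);
INSERT INTO subjects VALUES ('CS101', 'Introduction to Computer'),
                            ('CS201', 'Operating Systems');
INSERT INTO SubjectPrerequisite VALUES ('CS201', 'CS101');
""")

# Join subjects twice: once for the main subject, once for its prerequisite.
sentences = [row[0] for row in conn.execute("""
    SELECT pre.subject_name || ' is a pre-requisite of ' || main.subject_name
    FROM SubjectPrerequisite sp
    JOIN subjects main ON main.subject_code = sp.Primary_Subject_Code
    JOIN subjects pre ON pre.subject_code = sp.Prerequisite_Subject_Code
""")]

print(sentences)
```

Starting from the link table and using inner joins means subjects without any prerequisite produce no row at all, which suits the sentence format.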

How can I rewrite a SELECT query in this situation?

Here are two tables in a parent/child relationship.
What I need to do is select students with their average mark:
CREATE TABLE dbo.Students(
Id int NOT NULL,
Name varchar(15) NOT NULL,
CONSTRAINT PK_Students PRIMARY KEY CLUSTERED (Id)
)
CREATE TABLE [dbo].[Results](
Id int NOT NULL,
Subject varchar(15) NOT NULL,
Mark int NOT NULL
)
ALTER TABLE [dbo].[Results] WITH CHECK ADD CONSTRAINT [FK_Results_Students] FOREIGN KEY([Id])
REFERENCES [dbo].[Students] ([Id])
I wrote a query like this :
SELECT name , coalesce(avg(r.[mark]),0) as Avmark
FROM students s
LEFT JOIN results r ON s.[id]=r.[id]
GROUP BY s.[name]
ORDER BY ISNULL(AVG(r.[mark]),0) DESC;
But this returns all students with their average mark in descending order. What I need is to restrict the result set to the students who have the highest average mark, i.e. if there are two students with an average mark of 50 and one with 25, I need to display only the students with 50. If only one student has the highest average mark, only that student must appear in the result set. What is the best way to do this?
SQL Server 2005+, using CTEs:
WITH grade_average AS (
SELECT r.id,
AVG(r.mark) 'avg_mark'
FROM RESULTS r
GROUP BY r.id),
highest_average AS (
SELECT MAX(ga.avg_mark) 'highest_avg_mark'
FROM grade_average ga)
SELECT DISTINCT
s.name,
ga.avg_mark
FROM STUDENTS s
JOIN grade_average ga ON ga.id = s.id
JOIN highest_average ha ON ha.highest_avg_mark = ga.avg_mark
Non-CTE equivalent:
SELECT DISTINCT
s.name,
ga.avg_mark
FROM STUDENTS s
JOIN (SELECT r.id,
AVG(r.mark) 'avg_mark'
FROM RESULTS r
GROUP BY r.id) ga ON ga.id = s.id
JOIN (SELECT MAX(ga.avg_mark) 'highest_avg_mark'
FROM (SELECT r.id,
AVG(r.mark) 'avg_mark'
FROM RESULTS r
GROUP BY r.id) ga) ha ON ha.highest_avg_mark = ga.avg_mark
If you're using a relatively new version of MS SQL server, you can use WITH to make this simple to write:
WITH T AS (
SELECT
name,
coalesce(avg(r.[mark]),0) as mark
FROM students s
LEFT JOIN results r ON s.[id]=r.[id]
GROUP BY s.[name])
SELECT name as 'ФИО', mark as 'Средний бал'
FROM T
WHERE T.mark = (SELECT MAX(mark) from T)
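A quick way to check the tie behaviour of the CTE approach is to run it on a few rows. This sketch uses Python's sqlite3 module with invented students and marks. It groups by the student Id as well as the name, which is my own tweak so that two students sharing a name are not averaged together.

```python
import sqlite3

# Hypothetical sample data: Ann and Bob both average 50, Cid averages 25.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Results (
    Id INTEGER NOT NULL REFERENCES Students (Id),
    Subject TEXT NOT NULL,
    Mark INTEGER NOT NULL
);
INSERT INTO Students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO Results VALUES (1, 'Math', 40), (1, 'Art', 60),
                           (2, 'Math', 50),
                           (3, 'Math', 25);
""")

top = conn.execute("""
    WITH T AS (
        SELECT s.Id, s.Name, COALESCE(AVG(r.Mark), 0) AS mark
        FROM Students s
        LEFT JOIN Results r ON r.Id = s.Id
        GROUP BY s.Id, s.Name
    )
    SELECT Name, mark
    FROM T
    WHERE mark = (SELECT MAX(mark) FROM T)
    ORDER BY Name
""").fetchall()

print(top)
```

Both students tied on the highest average come back, and the one with the lower average is filtered out, which is the behaviour asked for in the question.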
Is it as simple as this? For all versions of SQL Server 2000+
SELECT TOP 1 WITH TIES
name, ISNULL(avg(r.[mark]),0) as AvMark
FROM
students s
LEFT JOIN
results r ON s.[id]=r.[id]
GROUP BY
s.[name]
ORDER BY
ISNULL(avg(r.[mark]),0) DESC;
SELECT name as 'ФИО',
coalesce(avg(r.[mark]),0) as 'Средний бал'
FROM students s
LEFT JOIN results r
ON s.[id]=r.[id]
GROUP BY s.[name]
HAVING AVG(r.[mark]) >= 50
ORDER BY ISNULL(AVG(r.[mark]),0) DESC
about HAVING clause