SQL relationships for unique sets of rows - sql

I am trying to set up a relationship between a couple of tables where a unique set of rows in one table relate to a row in another table.
I have come up with a scenario to reflect what I am trying to accomplish.
In this scenario, we are trying to determine the role(s) that a new hire should be given, based on the set of skills that they possess. An employee can be given multiple roles. For example, a software engineer with management experience is given both the Software Engineer and the Tech Lead roles. However, the roles given must line up exactly with a given skill set. If a new hire comes in with every skill we are looking for, we give them the CTO role. The CTO possesses all of the skills for both the Software Engineer and Tech Lead roles, but they are not given those roles.
I believe my issue boils down to the skill_set relationship, where I am trying to tie a unique set of rows from the skill table to a specific skill_set. Any given skill can be in many skill_sets, but when querying for a skill_set, I only want to return the skill_set that contains exactly the skills supplied, no more and no fewer. Currently I don't know of a good way to query for that specific skill_set.
We don't need to worry about trying to find roles for lists of skills that aren't valid skill_sets. Those can return no role.
Note: This schema is not set in stone. Changing it is definitely an option, so if I have modeled this incorrectly, we can fix that.
CREATE TABLE IF NOT EXISTS `skill` (
`id` int(6) unsigned NOT NULL,
`name` varchar(16) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `skill_set` (
`id` int(6) unsigned NOT NULL,
`skill_id` int(6) unsigned NOT NULL,
PRIMARY KEY (`id`, `skill_id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `default_role` (
`skill_set_id` int(6) unsigned NOT NULL,
`role_id` int(6) unsigned NOT NULL,
PRIMARY KEY (`skill_set_id`, `role_id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `role` (
`id` int(6) unsigned NOT NULL,
`name` varchar(32) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `skill` (`id`, `name`) VALUES
('1', 'python'),
('2', 'javascript'),
('3', 'ec2'),
('4', 'docker'),
('5', 'management');
INSERT INTO `skill_set` (`id`, `skill_id`) VALUES
('1', '1'),
('2', '2'),
('3', '1'),
('3', '2'),
('4', '3'),
('5', '4'),
('6', '1'),
('6', '2'),
('6', '5'),
('7', '3'),
('7', '4'),
('7', '5'),
('8', '1'),
('8', '2'),
('8', '3'),
('8', '4'),
('8', '5');
INSERT INTO `default_role` (`skill_set_id`, `role_id`) VALUES
('1', '1'),
('2', '1'),
('3', '2'),
('4', '3'),
('5', '3'),
('6', '2'),
('6', '4'),
('7', '3'),
('7', '4'),
('8', '5');
INSERT INTO `role` (`id`, `name`) VALUES
('1', 'Junior Software Engineer'),
('2', 'Software Engineer'),
('3', 'DevOps Engineer'),
('4', 'Tech Lead'),
('5', 'CTO');
A SQL fiddle is also available: http://sqlfiddle.com/#!9/86bcfe0
Some example outputs:
Given the skills: ['python']
Return the default role: Junior Software Engineer
Given the skills: ['python', 'javascript']
Return the default role: Software Engineer
Given the skills: ['ec2']
Return the default role: DevOps Engineer
Given the skills: ['python', 'javascript', 'management']
Return the default roles: Software Engineer, Tech Lead
Given the skills: ['python', 'javascript', 'ec2', 'docker', 'management']
Return the default role: CTO
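One way to express the "exactly this set of skills" lookup is relational division with GROUP BY and HAVING: keep only the skill_sets whose row count equals the number of supplied skills and whose skills all appear in the supplied list. The following is a minimal sketch rather than a definitive answer. It assumes SQLite through Python's sqlite3 module in place of MySQL, a simplified copy of the schema above with plain INTEGER/TEXT types, and a hypothetical helper named default_roles.

```python
import sqlite3

SCHEMA = """
CREATE TABLE skill (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE skill_set (id INTEGER NOT NULL, skill_id INTEGER NOT NULL,
                        PRIMARY KEY (id, skill_id));
CREATE TABLE default_role (skill_set_id INTEGER NOT NULL, role_id INTEGER NOT NULL,
                           PRIMARY KEY (skill_set_id, role_id));
CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO skill VALUES (1,'python'),(2,'javascript'),(3,'ec2'),(4,'docker'),(5,'management');
INSERT INTO skill_set VALUES (1,1),(2,2),(3,1),(3,2),(4,3),(5,4),(6,1),(6,2),(6,5),
                             (7,3),(7,4),(7,5),(8,1),(8,2),(8,3),(8,4),(8,5);
INSERT INTO default_role VALUES (1,1),(2,1),(3,2),(4,3),(5,3),(6,2),(6,4),(7,3),(7,4),(8,5);
INSERT INTO role VALUES (1,'Junior Software Engineer'),(2,'Software Engineer'),
                        (3,'DevOps Engineer'),(4,'Tech Lead'),(5,'CTO');
"""


def build_db():
    """Create an in-memory copy of the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def default_roles(conn, skills):
    """Return role names for the skill_set that matches `skills` exactly.

    Hypothetical helper: the name and signature are not from the question.
    """
    wanted = sorted(set(skills))
    if not wanted:
        return []
    marks = ",".join("?" * len(wanted))
    sql = f"""
        SELECT r.name
        FROM default_role dr
        JOIN role r ON r.id = dr.role_id
        WHERE dr.skill_set_id IN (
            SELECT ss.id
            FROM skill_set ss
            JOIN skill s ON s.id = ss.skill_id
            GROUP BY ss.id
            -- same size as the input, and every member is in the input
            HAVING COUNT(*) = ?
               AND SUM(s.name IN ({marks})) = ?
        )
        ORDER BY r.id
    """
    params = [len(wanted), *wanted, len(wanted)]
    return [name for (name,) in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_db()
    print(default_roles(conn, ["python", "javascript", "management"]))
```

The two HAVING conditions together rule out both subsets and supersets, so the CTO skill_set is not returned for a Software Engineer's skills, and a list that is not a valid skill_set returns no role.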

Related

How can I join tables using information from different rows?

I have two similar tables that I would like to join. See reproducible example below.
WHAT NEEDS TO BE DONE
See comments in code: concatenating the values '2021-01-01'(column: Date), 'hat'(column: content), 'cat'(column: content) and 'A'(column: Tote) in first_table would lead to a unique key that can be joined with the exact same data in second_table. The result would be the first row of the 4 unique events (see desired_result: '#first tote'). In reality the rows would be a few million.
Reproducible example:
CREATE OR REPLACE TABLE
`first_table` (
`Date` string NOT NULL,
`TotearrivalTimestamp` string NOT NULL,
`Tote` string NOT NULL,
`content` string NOT NULL,
`location` string NOT NULL,
);
INSERT INTO `first_table` (`Date`, `TotearrivalTimestamp`, `Tote`, `content`, `location`) VALUES
('2021-01-01', '13:00','A','hat','1'), #first tote
('2021-01-01', '13:00','A','cat','1'), #first tote
('2021-01-01', '14:00', 'B', 'toy', '1'),
('2021-01-01', '14:00', 'B', 'cat', '1'),
('2021-01-01', '15:00', 'A', 'toy', '1'),
('2021-01-01', '13:00', 'A', 'toy', '1'),
('2021-01-02', '13:00', 'A', 'hat', '1'),
('2021-01-02', '13:00', 'A', 'cat', '1');
CREATE OR REPLACE TABLE
`second_table` (
`Date` string NOT NULL,
`ToteendingTimestamp` string NOT NULL,
`Tote` string NOT NULL,
`content` string NOT NULL,
`location` string NOT NULL,
);
INSERT INTO `second_table` (`Date`, `ToteendingTimestamp`, `Tote`, `content`, `location`) VALUES
('2021-01-01', '20:00', 'B', 'cat', '2'),
('2021-01-01', '19:00', 'A', 'cat', '1'), #first tote
('2021-01-01', '19:00', 'A', 'hat', '1'), #first tote
('2021-01-01', '20:00', 'B', 'toy', '2'),
('2021-01-01', '14:00', 'A', 'toy', '1'),
('2021-01-02', '14:00', 'A', 'hat', '1'),
('2021-01-02', '14:00', 'A', 'cat', '1'),
('2021-01-01', '16:00', 'A', 'toy', '1');
CREATE OR REPLACE TABLE
`desired_result` (
`Date` string NOT NULL,
`Tote` string NOT NULL,
`TotearrivalTimestamp` string NOT NULL,
`ToteendingTimestamp` string NOT NULL,
`location_first_table` string NOT NULL,
`location_second_table` string NOT NULL,
);
INSERT INTO `desired_result` (`Date`, `Tote`, `TotearrivalTimestamp`, `ToteendingTimestamp`, `location_first_table`, `location_second_table`) VALUES
('2021-01-01', 'A', '13:00', '19:00', '1', '1'), #first tote
('2021-01-01', 'B', '14:00', '20:00', '1', '1'),
('2021-01-01', 'A', '15:00', '16:00', '1', '2'),
('2021-01-02', 'A', '13:00', '14:00', '1', '1');
#### this does not give what I want####
select first.date as Date, first.tote, first.totearrivaltimestamp, second.toteendingtimestamp, first.location as location_first_table, second.location as location_second_table
from `first_table` first
inner join `second_table` second
on first.tote = second.tote
and first.content = second.content;
I was able to reproduce the 'desired_result' table (mostly) with the SQL below. I believe there are a few typos in the 'insert into' statements. However, I think this meets the intent.
Query:
select
first_table.date as Date,
first_table.tote,
first_table.totearrivaltimestamp,
second_table.toteendingtimestamp,
first_table.location as location_first_table,
second_table.location as location_second_table
from first_table
inner join `second_table`
on first_table.Date = second_table.Date
and first_table.tote = second_table.tote
group by first_table.Date, first_table.TotearrivalTimestamp, first_table.tote;
result:
2021-01-01|A|13:00|19:00|1|1
2021-01-01|B|14:00|20:00|1|2
2021-01-01|A|15:00|19:00|1|1
2021-01-02|A|13:00|14:00|1|1
This result assumes your first table dates will always match for totes/timestamps. The group by function then merges duplicate results. The second table information matches the date and tote of the first table and is appended to the line item.
This answer should work. I think your issue might be with some of your quoting of tables....
select f.`Date`
, f.tote
, f.totearrivaltimestamp
, s.toteendingtimestamp
, f.location as location_first_table
, s.location as location_second_table
from `first_table` f
inner join `second_table` s on f.`Date` = s.`Date`
and f.tote = s.tote
and f.content = s.content

Why selecting a single attribute returns less rows than selecting all columns in oracle SQL

The tables created and the queries made are not the primary focus of this question. What confuses me is why the first query and the second query return different numbers of rows.
drop table Reserves;
drop table Sailors;
drop table Boats;
create table Sailors (
sid char(1) not null,
sname char(1) not null,
rating int,
age int not null,
primary key (sid)
);
create table Boats (
bid char(1) not null,
bname char(1) not null,
color varchar(5),
primary key (bid)
);
create table Reserves (
sid char(1) not null,
bid char(1) not null,
rdate int not null,
primary key (sid, bid, rdate),
foreign key (sid) references Sailors(sid)
on delete cascade,
foreign key (bid) references Boats(bid)
on delete cascade
);
------------------------------------------------------------------------------
-- Insert values
insert into Sailors values ('1', 'q', 90, 24);
insert into Sailors values ('0', 's', 60, 22);
insert into Sailors values ('2', 'd', 80, 20);
insert into Sailors values ('3', 'w', 70, 18);
insert into Sailors values ('4', 'a', 60, 19);
insert into Sailors values ('5', 'l', 80, 17);
insert into Sailors values ('6', 'o', 90, 18);
insert into Sailors values ('7', 'q', 70, 20);
insert into Sailors values ('8', 'd', 60, 16);
insert into Sailors values ('9', 'i', 80, 22);
insert into Boats values ('0', 'U', 'red');
insert into Boats values ('1', 'P', 'red');
insert into Boats values ('2', 'Q', 'blue');
insert into Boats values ('3', 'C', 'green');
insert into Boats values ('4', 'L', 'blue');
insert into Boats values ('5', 'O', 'blue');
insert into Boats values ('6', 'A', 'red');
insert into Boats values ('7', 'C', 'red');
insert into Boats values ('8', 'Y', 'green');
insert into Boats values ('9', 'N', 'blue');
insert into Reserves values ('0', '0', 3);
insert into Reserves values ('0', '1', 2);
insert into Reserves values ('0', '2', 1);
insert into Reserves values ('0', '2', 3);
insert into Reserves values ('1', '0', 4);
insert into Reserves values ('3', '2', 2);
insert into Reserves values ('4', '0', 3);
insert into Reserves values ('4', '0', 1);
insert into Reserves values ('4', '1', 3);
insert into Reserves values ('4', '6', 4);
insert into Reserves values ('4', '7', 1);
insert into Reserves values ('5', '8', 2);
insert into Reserves values ('5', '9', 2);
insert into Reserves values ('7', '4', 4);
insert into Reserves values ('7', '5', 1);
insert into Reserves values ('8', '3', 2);
insert into Reserves values ('9', '3', 3);
insert into Reserves values ('9', '0', 1);
insert into Reserves values ('9', '6', 1);
insert into Reserves values ('9', '8', 2);
commit;
select *
from Sailors join Boats on color='red' natural left outer join Reserves
where rdate is null;
select sid
from Sailors join Boats on color='red' natural left outer join Reserves
where rdate is null;
I want to find the sid of the sailors who have not ordered all the red boats. The first query above returns the rows I am expecting; nonetheless, the second query returns only rows with sid=2 and sid=6, even though the two queries are identical apart from the select list. Sailors with sid 2 and 6 are the only sailors who have not booked any boat.
As far as I can tell, this looks like a bug in Oracle's implementation of natural [...] join. I will do some testing to see if it affects inner joins too.
Instead of natural join, one can use the syntax left|right|inner join USING(...) and giving the list of column names in the using clause. The list of columns should be the list of ALL columns that have the same name in the two members of the join.
Very simple experimentation with the data you provided (+1 even for that alone) shows that the results are the same as if we had written using(sid, bid) in the first query, but only using(sid) in the second. If in either query - regardless of what it selects, whether * or sid - you use the using syntax, you get the same number of rows in the output as either your first or your second query, depending on what you put in the using clause.
So, what Oracle does for the second query is simply wrong. I can only speculate, but I believe Oracle looks at the SELECT clause first, and perhaps at other clauses, to see what columns it needs to retrieve from each table. (For example, this tells the optimizer what indexes it could use, etc.) And at this step, Oracle decides - in your second query - that it doesn't need bid. Then when it translates "natural" join to its own internal code, it doesn't throw in bid as a join column. Which is wrong - and which is why I called this a "bug".
IMPORTANT NOTE: Others have commented that both queries are "wrong" in that they do not solve the problem you were trying to solve. That may be entirely true. I didn't even look at your problem specification; here I am answering your question, which is valid regardless of the problem - which is, why do the two queries produce different numbers of rows. Even if they are "wrong" for your use case, they should essentially give "the same" wrong answer, not "different" wrong answers. That is the only thing I discussed above.
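The USING comparison described in this answer can be checked with a small script. This is only a sketch under stated assumptions: it uses SQLite via Python's sqlite3 module as a stand-in, so it does not reproduce Oracle's natural join behaviour; it only shows that USING (sid, bid) and USING (sid) give different row counts on the question's data. The helper names build_db and null_rdate_sids are hypothetical.

```python
import sqlite3

SAILORS = [("1", "q", 90, 24), ("0", "s", 60, 22), ("2", "d", 80, 20),
           ("3", "w", 70, 18), ("4", "a", 60, 19), ("5", "l", 80, 17),
           ("6", "o", 90, 18), ("7", "q", 70, 20), ("8", "d", 60, 16),
           ("9", "i", 80, 22)]
BOATS = [("0", "U", "red"), ("1", "P", "red"), ("2", "Q", "blue"),
         ("3", "C", "green"), ("4", "L", "blue"), ("5", "O", "blue"),
         ("6", "A", "red"), ("7", "C", "red"), ("8", "Y", "green"),
         ("9", "N", "blue")]
RESERVES = [("0", "0", 3), ("0", "1", 2), ("0", "2", 1), ("0", "2", 3),
            ("1", "0", 4), ("3", "2", 2), ("4", "0", 3), ("4", "0", 1),
            ("4", "1", 3), ("4", "6", 4), ("4", "7", 1), ("5", "8", 2),
            ("5", "9", 2), ("7", "4", 4), ("7", "5", 1), ("8", "3", 2),
            ("9", "3", 3), ("9", "0", 1), ("9", "6", 1), ("9", "8", 2)]


def build_db():
    """Load the question's sample data into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Sailors (sid TEXT PRIMARY KEY, sname TEXT, rating INT, age INT);
        CREATE TABLE Boats (bid TEXT PRIMARY KEY, bname TEXT, color TEXT);
        CREATE TABLE Reserves (sid TEXT, bid TEXT, rdate INT,
                               PRIMARY KEY (sid, bid, rdate));
    """)
    conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", SAILORS)
    conn.executemany("INSERT INTO Boats VALUES (?, ?, ?)", BOATS)
    conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)", RESERVES)
    return conn


def null_rdate_sids(conn, using_cols):
    """Run the question's query with an explicit USING list instead of NATURAL."""
    sql = f"""
        SELECT sid
        FROM Sailors
        JOIN Boats ON color = 'red'
        LEFT JOIN Reserves USING ({using_cols})
        WHERE rdate IS NULL
    """
    return [sid for (sid,) in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_db()
    print(len(null_rdate_sids(conn, "sid, bid")))  # join on both shared columns
    print(len(null_rdate_sids(conn, "sid")))       # join on sid only
```

Joining on both shared columns keeps every (sailor, red boat) pair that has no reservation, while joining on sid alone keeps only the sailors with no reservations at all, which matches the sid=2 and sid=6 rows reported in the question.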
To be honest, the query is ... errr... not correct
For example it gives you sailors with sid = 0 and sid = 1. These sailors bought red boats
This is SQL you need to display sailors that did not buy a red boat.
select sl.*
from Sailors sl
where not exists (select 1
from Reserves rs
join Boats bs on rs.bid = bs.bid
where bs.color = 'red' and rs.sid = sl.sid);
These 2 queries will give you what you asked for in your last comment. I've UNION'd them together to give you a single list of SIDs, but obviously you can run them independently if you want 2 separate lists.
-- Sailors with no reservations
select s.sid
from sailors s
where s.sid not in (
select sid from reserves)
union
-- Sailors with reservations but not of red boats
select s.sid
from sailors s
where s.sid not in (
Select distinct r.sid
from reserves r
inner join boats b on r.bid = b.bid
where b.color = 'red');

SQL query for courses, semester and students

Hi I have a schema that looks like this
I was trying to make these 3 queries
Find the names of the top 4 instructors who have taught the most number of distinct courses. Display also the total number of courses taught.
Output columns: InstructorName, NumberOfCoursesTaught
Sort by: NumberOfCoursesTaught in descending order
Find the top 3 semesters in which the most number of courses were offered. (Treat Spring of 2009 and Spring of 2010 as two different semesters.)
Output columns: Semester, Year, NumberOfCourses
Sort by: NumberOfCourses in descending order
Find the top 2 students who have taken the most number of courses.
Output columns: S_ID, StudentName, NumberOfCourses
Sort by: NumberOfCourses in descending order
For query 1 I wrote
Select name AS InstructorName, count(course_id) AS NumberOfCourses
from Teaches where name IN
(SELECT name FROM Instructor where Instructor.i_id = Teaches.i_id)
group by i_id
order by count(course_id) DESC;
For query 2
SELECT semester, year, count(course_id) as
NumberOfCourses from Takes WHERE year='2009'
group by semester, year
order by count(course_id) DESC;
For query 3
SELECT s_id as S_ID, name as StudentName, count(course_id) as NumberOfCourses
FROM Takes where name IN
(SELECT name from Student where Takes.s_id = Student.s_id)
group by s_id
order by count(course_id) DESC;
Query 1 and 3 give the error
ORA-00904: "NAME": invalid identifier
Query 2 is giving an output, but it's wrong. I need help making the 3 queries correct
Test Data is
tables file is
create table classroom (building varchar(15), room_number varchar(7), capacity numeric(4,0), primary key (building, room_number));
create table department (dept_name varchar(20), building varchar(15), budget numeric(12,2) check (budget > 0), primary key (dept_name));
create table course (course_id varchar(8), title varchar(50), dept_name varchar(20), credits numeric(2,0) check (credits > 0),
primary key(course_id));
create table instructor (i_ID varchar(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2) check (salary > 29000), primary key (i_ID));
create table section (course_id varchar(8), sec_id varchar(8), semester varchar(6) check (semester in ('Fall', 'Winter', 'Spring', 'Summer')), year numeric(4,0) check (year > 1701 and year < 2100), building varchar(15), room_number varchar(7), time_slot_id varchar(4), primary key (course_id, sec_id, semester, year));
create table teaches (i_ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0), primary key (i_ID, course_id, sec_id, semester, year));
create table student (s_ID varchar(5), name varchar(20) not null, dept_name varchar(20), tot_cred numeric(3,0) check (tot_cred >= 0), primary key (s_ID));
create table takes (s_ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0), grade varchar(2), primary key (s_ID, course_id, sec_id, semester, year));
create table advisor (s_ID varchar(5), i_ID varchar(5), primary key (s_ID));
create table time_slot (time_slot_id varchar(4), day varchar(1),start_hr numeric(2) check (start_hr >= 0 and start_hr < 24), start_min numeric(2) check (start_min >= 0 and start_min < 60), end_hr numeric(2) check (end_hr >= 0 and end_hr < 24), end_min numeric(2) check(end_min >= 0 and end_min < 60), primary key (time_slot_id, day, start_hr, start_min));
create table prereq (course_id varchar(8), prereq_id varchar(8), primary key (course_id, prereq_id));
create table grade_points(grade varchar(2), points Number(10,4), primary key (grade));
data file is
delete from prereq;
delete from time_slot;
delete from advisor;
delete from takes;
delete from student;
delete from teaches;
delete from section;
delete from instructor;
delete from course;
delete from department;
delete from classroom;
-- Classroom
insert into classroom values ('Packard', '101', '500');
insert into classroom values ('Painter', '514', '10');
insert into classroom values ('Taylor', '3128', '70');
insert into classroom values ('Watson', '100', '30');
insert into classroom values ('Watson', '120', '50');
-- Department
insert into department values ('Biology', 'Watson', '90000');
insert into department values ('Comp. Sci.', 'Taylor', '100000');
insert into department values ('Elec. Eng.', 'Taylor', '85000');
insert into department values ('Finance', 'Painter', '120000');
insert into department values ('History', 'Painter', '50000');
insert into department values ('Music', 'Packard', '80000');
insert into department values ('Physics', 'Watson', '70000');
-- Course
insert into course values ('BIO-101', 'Intro. to Biology', 'Biology', '4');
insert into course values ('BIO-301', 'Genetics', 'Biology', '4');
insert into course values ('BIO-399', 'Computational Biology', 'Biology', '3');
insert into course values ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', '4');
insert into course values ('CS-190', 'Game Design', 'Comp. Sci.', '4');
insert into course values ('CS-315', 'Robotics', 'Comp. Sci.', '3');
insert into course values ('CS-319', 'Image Processing', 'Comp. Sci.', '3');
insert into course values ('CS-347', 'Database System Concepts', 'Comp. Sci.', '3');
insert into course values ('EE-181', 'Intro. to Digital Systems', 'Elec. Eng.', '3');
insert into course values ('FIN-201', 'Investment Banking', 'Finance', '3');
insert into course values ('HIS-351', 'World History', 'History', '3');
insert into course values ('MU-199', 'Music Video Production', 'Music', '3');
insert into course values ('PHY-101', 'Physical Principles', 'Physics', '4');
-- Instructor
insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', '65000');
insert into instructor values ('12121', 'Wu', 'Finance', '90000');
insert into instructor values ('15151', 'Mozart', 'Music', '40000');
insert into instructor values ('22222', 'Einstein', 'Physics', '95000');
insert into instructor values ('32343', 'El Said', 'History', '60000');
insert into instructor values ('33456', 'Gold', 'Physics', '87000');
insert into instructor values ('45565', 'Katz', 'Comp. Sci.', '75000');
insert into instructor values ('58583', 'Califieri', 'History', '62000');
insert into instructor values ('76543', 'Singh', 'Finance', '80000');
insert into instructor values ('76766', 'Crick', 'Biology', '72000');
insert into instructor values ('83821', 'Brandt', 'Comp. Sci.', '92000');
insert into instructor values ('98345', 'Kim', 'Elec. Eng.', '80000');
-- Section
insert into section values ('BIO-101', '1', 'Summer', '2009', 'Painter', '514', 'B');
insert into section values ('BIO-301', '1', 'Summer', '2010', 'Painter', '514', 'A');
insert into section values ('CS-101', '1', 'Fall', '2009', 'Packard', '101', 'H');
insert into section values ('CS-101', '1', 'Spring', '2010', 'Packard', '101', 'F');
insert into section values ('CS-190', '1', 'Spring', '2009', 'Taylor', '3128', 'E');
insert into section values ('CS-190', '2', 'Spring', '2009', 'Taylor', '3128', 'A');
insert into section values ('CS-315', '1', 'Spring', '2010', 'Watson', '120', 'D');
insert into section values ('CS-319', '1', 'Spring', '2010', 'Watson', '100', 'B');
insert into section values ('CS-319', '2', 'Spring', '2010', 'Taylor', '3128', 'C');
insert into section values ('CS-347', '1', 'Fall', '2009', 'Taylor', '3128', 'A');
insert into section values ('EE-181', '1', 'Spring', '2009', 'Taylor', '3128', 'C');
insert into section values ('FIN-201', '1', 'Spring', '2010', 'Packard', '101', 'B');
insert into section values ('HIS-351', '1', 'Spring', '2010', 'Painter', '514', 'C');
insert into section values ('MU-199', '1', 'Spring', '2010', 'Packard', '101', 'D');
insert into section values ('PHY-101', '1', 'Fall', '2009', 'Watson', '100', 'A');
-- Teaches
insert into teaches values ('10101', 'CS-101', '1', 'Fall', '2009');
insert into teaches values ('10101', 'CS-315', '1', 'Spring', '2010');
insert into teaches values ('10101', 'CS-347', '1', 'Fall', '2009');
insert into teaches values ('12121', 'FIN-201', '1', 'Spring', '2010');
insert into teaches values ('15151', 'MU-199', '1', 'Spring', '2010');
insert into teaches values ('22222', 'PHY-101', '1', 'Fall', '2009');
insert into teaches values ('32343', 'HIS-351', '1', 'Spring', '2010');
insert into teaches values ('45565', 'CS-101', '1', 'Spring', '2010');
insert into teaches values ('45565', 'CS-319', '1', 'Spring', '2010');
insert into teaches values ('76766', 'BIO-101', '1', 'Summer', '2009');
insert into teaches values ('76766', 'BIO-301', '1', 'Summer', '2010');
insert into teaches values ('83821', 'CS-190', '1', 'Spring', '2009');
insert into teaches values ('83821', 'CS-190', '2', 'Spring', '2009');
insert into teaches values ('83821', 'CS-319', '2', 'Spring', '2010');
insert into teaches values ('98345', 'EE-181', '1', 'Spring', '2009');
-- Student
insert into student values ('00128', 'Zhang', 'Comp. Sci.', '102');
insert into student values ('12345', 'Shankar', 'Comp. Sci.', '32');
insert into student values ('19991', 'Brandt', 'History', '80');
insert into student values ('23121', 'Chavez', 'Finance', '110');
insert into student values ('44553', 'Peltier', 'Physics', '56');
insert into student values ('45678', 'Levy', 'Physics', '46');
insert into student values ('54321', 'Williams', 'Comp. Sci.', '54');
insert into student values ('55739', 'Sanchez', 'Music', '38');
insert into student values ('70557', 'Snow', 'Physics', '0');
insert into student values ('76543', 'Brown', 'Comp. Sci.', '58');
insert into student values ('76653', 'Aoi', 'Elec. Eng.', '60');
insert into student values ('98765', 'Bourikas', 'Elec. Eng.', '98');
insert into student values ('98988', 'Tanaka', 'Biology', '120');
-- Takes
insert into takes values ('00128', 'CS-101', '1', 'Fall', '2009', 'A');
insert into takes values ('00128', 'CS-347', '1', 'Fall', '2009', 'A-');
insert into takes values ('12345', 'CS-101', '1', 'Fall', '2009', 'C');
insert into takes values ('12345', 'CS-190', '2', 'Spring', '2009', 'A');
insert into takes values ('12345', 'CS-315', '1', 'Spring', '2010', 'A');
insert into takes values ('12345', 'CS-347', '1', 'Fall', '2009', 'A');
insert into takes values ('19991', 'HIS-351', '1', 'Spring', '2010', 'B');
insert into takes values ('23121', 'FIN-201', '1', 'Spring', '2010', 'C+');
insert into takes values ('44553', 'PHY-101', '1', 'Fall', '2009', 'B-');
insert into takes values ('45678', 'CS-101', '1', 'Fall', '2009', 'F');
insert into takes values ('45678', 'CS-101', '1', 'Spring', '2010', 'B+');
insert into takes values ('45678', 'CS-319', '1', 'Spring', '2010', 'B');
insert into takes values ('54321', 'CS-101', '1', 'Fall', '2009', 'A-');
insert into takes values ('54321', 'CS-190', '2', 'Spring', '2009', 'B+');
insert into takes values ('55739', 'MU-199', '1', 'Spring', '2010', 'A-');
insert into takes values ('76543', 'CS-101', '1', 'Fall', '2009', 'A');
insert into takes values ('76543', 'CS-319', '2', 'Spring', '2010', 'A');
insert into takes values ('76653', 'EE-181', '1', 'Spring', '2009', 'C');
insert into takes values ('98765', 'CS-101', '1', 'Fall', '2009', 'C-');
insert into takes values ('98765', 'CS-315', '1', 'Spring', '2010', 'B');
insert into takes values ('98988', 'BIO-101', '1', 'Summer', '2009', 'A');
insert into takes values ('98988', 'BIO-301', '1', 'Summer', '2010', null);
-- Advisor
insert into advisor values ('00128', '45565');
insert into advisor values ('12345', '10101');
insert into advisor values ('23121', '76543');
insert into advisor values ('44553', '22222');
insert into advisor values ('45678', '22222');
insert into advisor values ('76543', '45565');
insert into advisor values ('76653', '98345');
insert into advisor values ('98765', '98345');
insert into advisor values ('98988', '76766');
-- Time_slot
insert into time_slot values ('A', 'M', '8', '0', '8', '50');
insert into time_slot values ('A', 'W', '8', '0', '8', '50');
insert into time_slot values ('A', 'F', '8', '0', '8', '50');
insert into time_slot values ('B', 'M', '9', '0', '9', '50');
insert into time_slot values ('B', 'W', '9', '0', '9', '50');
insert into time_slot values ('B', 'F', '9', '0', '9', '50');
insert into time_slot values ('C', 'M', '11', '0', '11', '50');
insert into time_slot values ('C', 'W', '11', '0', '11', '50');
insert into time_slot values ('C', 'F', '11', '0', '11', '50');
insert into time_slot values ('D', 'M', '13', '0', '13', '50');
insert into time_slot values ('D', 'W', '13', '0', '13', '50');
insert into time_slot values ('D', 'F', '13', '0', '13', '50');
insert into time_slot values ('E', 'T', '10', '30', '11', '45 ');
insert into time_slot values ('E', 'R', '10', '30', '11', '45 ');
insert into time_slot values ('F', 'T', '14', '30', '15', '45 ');
insert into time_slot values ('F', 'R', '14', '30', '15', '45 ');
insert into time_slot values ('G', 'M', '16', '0', '16', '50');
insert into time_slot values ('G', 'W', '16', '0', '16', '50');
insert into time_slot values ('G', 'F', '16', '0', '16', '50');
insert into time_slot values ('H', 'W', '10', '0', '12', '30');
-- Prereq
insert into prereq values ('BIO-301', 'BIO-101');
insert into prereq values ('BIO-399', 'BIO-101');
insert into prereq values ('CS-190', 'CS-101');
insert into prereq values ('CS-315', 'CS-101');
insert into prereq values ('CS-319', 'CS-101');
insert into prereq values ('CS-347', 'CS-101');
insert into prereq values ('EE-181', 'PHY-101');
-- Grade_points
insert into grade_points values ('A+', 4.0);
insert into grade_points values ('A', 4.0);
insert into grade_points values ('A-', 3.7);
insert into grade_points values ('B+', 3.3);
insert into grade_points values ('B', 3.0);
insert into grade_points values ('B-', 2.7);
insert into grade_points values ('C+', 2.3);
insert into grade_points values ('C', 2.0);
insert into grade_points values ('C-', 1.7);
insert into grade_points values ('D+', 1.3);
insert into grade_points values ('D', 1.0);
insert into grade_points values ('D-', 0.7);
insert into grade_points values ('F', 0.0);
insert into grade_points values ('NP', 0.0);
insert into grade_points values ('U', 0.0);
Expected Query 1:
Srinivasan 3
Brandt 2
Crick 2
Katz 2
My result of Query 2:
Fall 2009 9
Spring 2009 3
Summer 2009 1
Expected Query 2:
Spring 2010 7
Spring 2009 3
Fall 2009 3
Expected Query 3:
12345 Shankar 4
45678 Levy 3
I can't give you the full solution because you are still learning, but here are some guidelines.
First one:
Select name AS InstructorName, count(course_id) AS NumberOfCourses
from Teaches where name IN ....
Teaches doesn't have name, instead of WHERE you need JOIN to Instructor table
Second one:
You don't need the filter WHERE year='2009'. What they ask for is to GROUP BY year, semester; if you GROUP BY semester alone, then all Spring semesters will end up in the same group.
Third one: Same as the first. You need a JOIN to Student.
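As a sketch of the first hint only (replacing the WHERE name IN ... subquery with a JOIN to Instructor), the query below is run against the question's teaches data. It assumes SQLite via Python's sqlite3 module rather than Oracle, a cut-down instructor table holding only i_ID and name, a tie-break on name, and LIMIT in place of Oracle's row-limiting syntax. The helper names are hypothetical.

```python
import sqlite3

INSTRUCTORS = [("10101", "Srinivasan"), ("12121", "Wu"), ("15151", "Mozart"),
               ("22222", "Einstein"), ("32343", "El Said"), ("33456", "Gold"),
               ("45565", "Katz"), ("58583", "Califieri"), ("76543", "Singh"),
               ("76766", "Crick"), ("83821", "Brandt"), ("98345", "Kim")]
TEACHES = [("10101", "CS-101", "1", "Fall", 2009), ("10101", "CS-315", "1", "Spring", 2010),
           ("10101", "CS-347", "1", "Fall", 2009), ("12121", "FIN-201", "1", "Spring", 2010),
           ("15151", "MU-199", "1", "Spring", 2010), ("22222", "PHY-101", "1", "Fall", 2009),
           ("32343", "HIS-351", "1", "Spring", 2010), ("45565", "CS-101", "1", "Spring", 2010),
           ("45565", "CS-319", "1", "Spring", 2010), ("76766", "BIO-101", "1", "Summer", 2009),
           ("76766", "BIO-301", "1", "Summer", 2010), ("83821", "CS-190", "1", "Spring", 2009),
           ("83821", "CS-190", "2", "Spring", 2009), ("83821", "CS-319", "2", "Spring", 2010),
           ("98345", "EE-181", "1", "Spring", 2009)]


def build_db():
    """Load a reduced copy of the instructor and teaches tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE instructor (i_ID TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE teaches (i_ID TEXT, course_id TEXT, sec_id TEXT,
                              semester TEXT, year INT,
                              PRIMARY KEY (i_ID, course_id, sec_id, semester, year));
    """)
    conn.executemany("INSERT INTO instructor VALUES (?, ?)", INSTRUCTORS)
    conn.executemany("INSERT INTO teaches VALUES (?, ?, ?, ?, ?)", TEACHES)
    return conn


def top_instructors(conn, limit=4):
    """Instructors ranked by number of distinct courses taught."""
    sql = """
        SELECT i.name AS InstructorName,
               COUNT(DISTINCT t.course_id) AS NumberOfCoursesTaught
        FROM instructor i
        JOIN teaches t ON t.i_ID = i.i_ID   -- the name comes from instructor
        GROUP BY i.i_ID, i.name
        ORDER BY NumberOfCoursesTaught DESC, i.name
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


if __name__ == "__main__":
    for row in top_instructors(build_db()):
        print(row)
```

COUNT(DISTINCT course_id) matters here because an instructor who teaches two sections of the same course should be counted once for that course.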

Find average number of views before first lead event happens

create table events (
fk_user integer,
event varchar(40),
time integer
);
Insert into events (fk_user, event, time)
VALUES
('1', 'view', '1'),
('1', 'view', '3'),
('1', 'view', '4'),
('1', 'lead', '5'),
('1', 'view', '6'),
('1', 'view', '7'),
('1', 'lead', '9'),
('2', 'view', '1'),
('2', 'lead', '2'),
('2', 'lead', '3'),
('2', 'view', '6'),
('2', 'view', '7'),
('2', 'view', '8'),
('5', 'view', '1'),
('5', 'view', '2'),
('2', 'view', '4'),
('2', 'lead', '5'),
('2', 'view', '9');
What I am trying to find is: There are 3 'views' before a 'lead' occurs from the top. I want to take the average of the 'time' of first three occurrences. Is it possible to do with the window function ?
Expected output should be: (1+3+4)/3 = 2.666 (If taken integer then 3)
You can use the min window function along with a case expression to find the first_lead_time per fk_user, and then use a derived table to get the average time of the view rows that come before the first_lead_time.
select fk_user, avg(time) from (
select * ,
min (case when event = 'lead' then time end) over (partition by fk_user) as first_lead_time
from events
) t where time < first_lead_time
and event = 'view'
group by fk_user
Another way
select e.fk_user, avg(e.time)
from events e
join (
select min(time) first_lead_time, fk_user
from events
where event = 'lead'
group by fk_user
) t on t.fk_user = e.fk_user
where e.time < t.first_lead_time
group by e.fk_user
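To see what the window-function approach returns on the sample rows, here is a minimal runnable sketch. It assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module, uses the fk_user column name from the table definition, and wraps the query in a hypothetical helper named avg_view_time_before_first_lead.

```python
import sqlite3

EVENTS = [(1, "view", 1), (1, "view", 3), (1, "view", 4), (1, "lead", 5),
          (1, "view", 6), (1, "view", 7), (1, "lead", 9), (2, "view", 1),
          (2, "lead", 2), (2, "lead", 3), (2, "view", 6), (2, "view", 7),
          (2, "view", 8), (5, "view", 1), (5, "view", 2), (2, "view", 4),
          (2, "lead", 5), (2, "view", 9)]


def build_db():
    """Load the question's events rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (fk_user INTEGER, event TEXT, time INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", EVENTS)
    return conn


def avg_view_time_before_first_lead(conn):
    """Average `time` of the views that precede each user's first lead."""
    sql = """
        SELECT fk_user, AVG(time) AS avg_time
        FROM (
            SELECT *,
                   MIN(CASE WHEN event = 'lead' THEN time END)
                       OVER (PARTITION BY fk_user) AS first_lead_time
            FROM events
        ) t
        WHERE event = 'view'
          AND time < first_lead_time   -- NULL for users with no lead, so they drop out
        GROUP BY fk_user
        ORDER BY fk_user
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for user, avg_time in avg_view_time_before_first_lead(build_db()):
        print(user, round(avg_time, 3))
```

For user 1 this averages times 1, 3 and 4, which is the (1+3+4)/3 figure from the question. User 5 has no lead event, so the comparison against a NULL first_lead_time filters that user out.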

Comparing data in same table - SQL

I'm working with a course catalog table in which I have catalog codes and course codes for when the courses were offered. What I need to do is to determine when a course isn't being offered any longer and mark it as an archived course.
CREATE TABLE [dbo].[COURSECATALOG](
[catalog_code] [char](6) NOT NULL,
[course_code] [char](7) NOT NULL,
[title] [char](40) NOT NULL,
[credits] [decimal](7, 4) NULL,
)
insert into coursecatalog
values
('200810', 'BIOL101', 'Biology', '3'),
('200810', 'CHEM201', 'Advanced Chemistry', '3'),
('200810', 'ACCT101', 'Beginning Accounting', '3'),
('201012', 'ACCT101', 'Beginning Accounting', '3'),
('201214', 'ACCT101', 'Beginning Accounting', '3'),
('201214', 'ENGL101', 'English Composition', '3'),
('201416', 'PSYC101', 'Psychology', '3'),
('201618', 'PSYC101', 'Psychology', '3'),
('201618', 'BIOL101', 'Biology', '3'),
('201618', 'CHEM201', 'Advanced Chemistry', '3'),
('201618', 'ENGL101', 'English Composition', '3'),
('201618', 'PSYC101', 'Psychology', '3')
In this case, I need to return ACCT101 - Beginning Accounting since this isn't being offered anymore and should be considered an archived course.
My code so far:
SELECT
catalog_code, course_code
FROM COURSECATALOG t1
WHERE NOT EXISTS (SELECT 1
FROM COURSECATALOG t2
WHERE t2.catalog_code <> t1.catalog_code
AND t2.course_code = t1.course_code)
order by
course_code, catalog_code
But this only returns courses that were offered in exactly one catalog. I need to figure out how to get courses that might have been offered in multiple catalogs but aren't offered any longer.
Any assistance that can be provided is appreciated!
Thank you!
I think the catalog_code is a date with YYYYMM format
SELECT course_code FROM (
SELECT CONVERT(char, catalog_code,112) AS catalog_code, course_code FROM COURSECATALOG
) AS Q
GROUP BY course_code
HAVING MAX(catalog_code) < '20160101'
Example:
http://sqlfiddle.com/#!6/32adfb/14/1
You want something like this:
SELECT course_code
FROM COURSECATALOG t1
GROUP BY course_code
HAVING MAX(catalog_code) <> '201618';
This assumes that "currently being offered" means that it is in the 201618 catalog.
You could calculate the most recent catalog:
SELECT course_code
FROM COURSECATALOG t1
GROUP BY course_code
HAVING MAX(catalog_code) <> (SELECT MAX(catalog_code) FROM COURSECATALOG);
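The "latest catalog" comparison can be tried against the question's rows with the sketch below. It assumes SQLite via Python's sqlite3 module instead of SQL Server, plain TEXT columns in place of char(n), and that a course counts as current only if it appears in the most recent catalog_code. The helper name archived_courses is hypothetical.

```python
import sqlite3

CATALOG = [("200810", "BIOL101", "Biology", 3),
           ("200810", "CHEM201", "Advanced Chemistry", 3),
           ("200810", "ACCT101", "Beginning Accounting", 3),
           ("201012", "ACCT101", "Beginning Accounting", 3),
           ("201214", "ACCT101", "Beginning Accounting", 3),
           ("201214", "ENGL101", "English Composition", 3),
           ("201416", "PSYC101", "Psychology", 3),
           ("201618", "PSYC101", "Psychology", 3),
           ("201618", "BIOL101", "Biology", 3),
           ("201618", "CHEM201", "Advanced Chemistry", 3),
           ("201618", "ENGL101", "English Composition", 3),
           ("201618", "PSYC101", "Psychology", 3)]


def build_db():
    """Load the question's course catalog rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE COURSECATALOG (
            catalog_code TEXT NOT NULL,
            course_code  TEXT NOT NULL,
            title        TEXT NOT NULL,
            credits      NUMERIC
        )
    """)
    conn.executemany("INSERT INTO COURSECATALOG VALUES (?, ?, ?, ?)", CATALOG)
    return conn


def archived_courses(conn):
    """Courses whose most recent catalog is older than the newest catalog."""
    sql = """
        SELECT course_code
        FROM COURSECATALOG
        GROUP BY course_code
        HAVING MAX(catalog_code) <> (SELECT MAX(catalog_code) FROM COURSECATALOG)
        ORDER BY course_code
    """
    return [code for (code,) in conn.execute(sql)]


if __name__ == "__main__":
    print(archived_courses(build_db()))
```

Because the catalog codes are fixed-width digit strings, comparing them as text orders them chronologically, which is what MAX relies on here.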