How should I write the JOIN clause to make it work well?

I'm doing Q4 (Find the titles of all movies not reviewed by Chris Jackson.) from the SQL Movie-Rating Query Exercises Extras and I don't know why this code doesn't work:
SELECT DISTINCT movie.title
FROM movie
INNER JOIN rating ON movie.mid = rating.mID
INNER JOIN reviewer ON rating.rid = reviewer.rid
WHERE rating.mid NOT IN (SELECT rating.mid FROM rating WHERE rating.rid = (SELECT reviewer.rid FROM reviewer WHERE reviewer.name = 'Chris Jackson') )
Output:
title
Gone with the Wind
Snow White
Avatar
This output doesn't include movies that ARE in the movie table but ARE NOT in the rating table, so I suspect this has something to do with the JOIN clause.
TABLES:
Movie
mID title year director
101 Gone with the Wind 1939 Victor Fleming
102 Star Wars 1977 George Lucas
103 The Sound of Music 1965 Robert Wise
104 E.T. 1982 Steven Spielberg
105 Titanic 1997 James Cameron
106 Snow White 1937 <null>
107 Avatar 2009 James Cameron
108 Raiders of the Lost Ark 1981 Steven Spielberg
Reviewer
rID name
201 Sarah Martinez
202 Daniel Lewis
203 Brittany Harris
204 Mike Anderson
205 Chris Jackson
206 Elizabeth Thomas
207 James Cameron
208 Ashley White
Rating
rID mID stars ratingDate
201 101 2 2011-01-22
201 101 4 2011-01-27
202 106 4 <null>
203 103 2 2011-01-20
203 108 4 2011-01-12
203 108 2 2011-01-30
204 101 3 2011-01-09
205 103 3 2011-01-27
205 104 2 2011-01-22
205 108 4 <null>
206 107 3 2011-01-15
206 106 5 2011-01-19
207 107 5 2011-01-20
208 104 3 2011-01-02

First you must use LEFT JOIN, so that movies without any rating are kept, and then GROUP BY movie.mid, movie.title and put the condition in the HAVING clause:
SELECT movie.title
FROM movie
LEFT JOIN rating ON movie.mid = rating.mID
LEFT JOIN reviewer ON rating.rid = reviewer.rid
GROUP BY movie.mid, movie.title
HAVING SUM(CASE WHEN reviewer.name = 'Chris Jackson' THEN 1 ELSE 0 END) = 0
Results:
> | title |
> | :----------------- |
> | Avatar |
> | Gone with the Wind |
> | Snow White |
> | Star Wars |
> | Titanic |
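To see the difference between the two joins side by side, here is a minimal sketch that loads the tables from the question into an in-memory SQLite database and runs both the original INNER JOIN query and the LEFT JOIN / HAVING query. Using Python's sqlite3 module is an assumption made only to keep the example self-contained; only the columns the queries need are loaded, and an ORDER BY is added to the second query for a stable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (mID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE reviewer (rID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rating (rID INTEGER, mID INTEGER, stars INTEGER);
INSERT INTO movie VALUES
  (101,'Gone with the Wind'),(102,'Star Wars'),(103,'The Sound of Music'),
  (104,'E.T.'),(105,'Titanic'),(106,'Snow White'),(107,'Avatar'),
  (108,'Raiders of the Lost Ark');
INSERT INTO reviewer VALUES
  (201,'Sarah Martinez'),(202,'Daniel Lewis'),(203,'Brittany Harris'),
  (204,'Mike Anderson'),(205,'Chris Jackson'),(206,'Elizabeth Thomas'),
  (207,'James Cameron'),(208,'Ashley White');
INSERT INTO rating VALUES
  (201,101,2),(201,101,4),(202,106,4),(203,103,2),(203,108,4),(203,108,2),
  (204,101,3),(205,103,3),(205,104,2),(205,108,4),(206,107,3),(206,106,5),
  (207,107,5),(208,104,3);
""")

# Query from the question: the INNER JOIN drops movies that have no rating row.
inner_sql = """
SELECT DISTINCT movie.title
FROM movie
INNER JOIN rating ON movie.mID = rating.mID
INNER JOIN reviewer ON rating.rID = reviewer.rID
WHERE rating.mID NOT IN (
    SELECT rating.mID FROM rating
    WHERE rating.rID = (SELECT reviewer.rID FROM reviewer
                        WHERE reviewer.name = 'Chris Jackson'))
"""

# LEFT JOIN keeps every movie; HAVING counts Chris Jackson's reviews per movie.
left_sql = """
SELECT movie.title
FROM movie
LEFT JOIN rating ON movie.mID = rating.mID
LEFT JOIN reviewer ON rating.rID = reviewer.rID
GROUP BY movie.mID, movie.title
HAVING SUM(CASE WHEN reviewer.name = 'Chris Jackson' THEN 1 ELSE 0 END) = 0
ORDER BY movie.title
"""

inner_titles = sorted(row[0] for row in conn.execute(inner_sql))
left_titles = [row[0] for row in conn.execute(left_sql)]
print(inner_titles)  # the three rated movies from the question's output
print(left_titles)   # additionally contains the two movies nobody rated
```

The unrated movies (Star Wars and Titanic) only appear in the second list, because an unmatched movie survives a LEFT JOIN as a row with NULL rating columns.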

Although you can use aggregation, I would recommend NOT EXISTS for this purpose. It is close to the phrasing of the logic you want:
SELECT m.*
FROM movie m
WHERE NOT EXISTS (SELECT 1
FROM rating r JOIN
reviewer re
ON r.rid = re.rid
WHERE m.mid = r.mID AND
re.name = 'Chris Jackson'
);
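From a performance perspective I would expect NOT EXISTS to do at least as well as the aggregation, and often much better: it can stop at the first matching rating for each movie instead of joining and grouping every row. An index on rating(mID, rID) should help it further.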
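As a quick check of the NOT EXISTS form, here is a small sketch against an in-memory SQLite database (the use of Python's sqlite3 module is an assumption for the sake of a runnable example). The rating table is reduced to one (rID, mID) pair per review and only the reviewers that appear in it are loaded; it selects m.title rather than m.* to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (mID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE reviewer (rID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rating (rID INTEGER, mID INTEGER);
INSERT INTO movie VALUES
  (101,'Gone with the Wind'),(102,'Star Wars'),(103,'The Sound of Music'),
  (104,'E.T.'),(105,'Titanic'),(106,'Snow White'),(107,'Avatar'),
  (108,'Raiders of the Lost Ark');
INSERT INTO reviewer VALUES
  (201,'Sarah Martinez'),(202,'Daniel Lewis'),(203,'Brittany Harris'),
  (204,'Mike Anderson'),(205,'Chris Jackson'),(206,'Elizabeth Thomas'),
  (207,'James Cameron'),(208,'Ashley White');
INSERT INTO rating VALUES
  (201,101),(202,106),(203,103),(203,108),(204,101),(205,103),(205,104),
  (205,108),(206,107),(206,106),(207,107),(208,104);
""")

# A movie qualifies when no rating by Chris Jackson exists for it.
not_exists_sql = """
SELECT m.title
FROM movie m
WHERE NOT EXISTS (SELECT 1
                  FROM rating r
                  JOIN reviewer re ON r.rID = re.rID
                  WHERE r.mID = m.mID
                    AND re.name = 'Chris Jackson')
ORDER BY m.title
"""

unreviewed = [row[0] for row in conn.execute(not_exists_sql)]
print(unreviewed)
```

Movies with no ratings at all pass the test automatically, since the subquery finds nothing for them.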

Related

DB2: SQL CASE with GROUP BY

I have the following tables
---SALARY_ITEMS---
PERSONID | EMPLOYMENT_REF | GROUP1 | CODE  | FROM       | END        | QUANTITY
000101   | XYX            | 400    | 11101 | 2020-02-12 | 2020-02-12 | 12
000101   | XYX            | 300    | 1100  | 2020-01-29 | 2020-02-29 | 1
000102   | XYY            | 450    | 11111 | 2020-02-01 | 2020-02-12 | 19
000102   | XYY            | 400    | 11101 | 2020-02-02 | 2020-02-12 | 82
000103   | XYA            | 500    | 1100  | 2020-02-10 | 2020-02-12 | 11
000104   | XYB            | 700    | 1100  | 2020-01-12 | 2020-02-12 | 24
---PERSON ---
PERSONID NAME
000101 Carolina
000102 Helen
000103 Jack
000104 Anna
---EMPLOYMENT---
PERSONID EMPLOYMENT_REF POSITION
000101 XYX doctor
000102 XYY nurse
000103 XYA nurse
000104 XYB Proffesor
----absent---
PERSONID CODE2 FROM END
000101 123 2020-03-01 2020-06-30
000102 120 2020-02-05 2020-02-13
000102 123 2020-03-01 2020-03-28
000103 115 2020-05-05 2020-06-30
000104 123 2020-02-01 2020-05-30
What I tried to do: get all employees who are doctors or nurses, belong to certain groups with certain codes, and worked over 100 hours in February 2020.
The following SQL query gives me what I want, but I want to add one thing to it:
a new column showing whether the employee was absent during the same period (February 2020) with absence code 120 or 119, or both.
If so, the column should contain the CODE2 value, otherwise 'NOTHING'.
How can I do this in DB2?
This is the result I need to get:
PERSONID | NAME | POSITION | QUANTITY | ABSENT (this is what I want to have)
000102 Helen NURSE 101 120
Query:
SELECT
S.PERSONID, P.NAME, E.POSITION, SUM(S.QUANTITY) AS QUANTITY
FROM
SALARY_ITEMS S
LEFT JOIN
PERSON P ON S.PERSONID = P.PERSONID
LEFT JOIN
EMPLOYMENT E ON E.EMPLOYMENT_REF = S.EMPLOYMENT_REF
WHERE
S.group1 IN ('400', '440', '450', '470', '640')
AND S.code IN ('11101', '11111', '11121', '11131', '11141')
AND S.from >= '2020-02-01'
AND S.end <= '2020-02-29'
AND E.POSITION IN ('nurse', 'doctor')
AND (SELECT SUM(S2.QUANTITY) AS QUANTITY2
FROM SALARY_ITEMS S2
WHERE S2.group1 IN ('400', '440', '450', '470', '640')
AND S2.code IN ('11101', '11111', '11121', '11131', '11141')
AND S2.from >= '2020-02-01'
AND S2.end <= '2020-02-29'
AND S.PERSONID = S2.PERSONID) >= '100'
GROUP BY
S.PERSONID, P.NAME, E.POSITION

SQL join to identify group members

I have got a client table which pretty much looks like below:
Client List
customer no. Customer name
123 Kristen Smith
128 Jeremy Church
127 Alan Li
132 Ryan Nelson
I need to map it to a Customer_Dim table
Customer_Dim
customer no. Customer name Group no. Group Name Cust_Active Flag
123 Kristen Smith 5491 Zealong Tea Estate Y
167 Anna Hathaway 5823 AA Insurance Y
146 Simon Joe 5671 Direct Automobile Y
148 Henry Wilson 5823 AA Insurance Y
195 Graham Brown 5491 Zealong Tea Estate Y
172 Daria Smith 5671 Direct Automobile N
122 Dyana Smith 5823 AA Insurance N
132 Ryan Nelson 5671 Direct Automobile N
128 Jeremy Church 5823 AA Insurance Y
127 Alan Li 5671 Direct Automobile Y
With this mapping I need to do two things:
1. get the clients' group numbers from the Customer_Dim table above (which I am able to do with a simple left join);
2. list all the remaining active customers that belong to the same groups as the clients [I AM UNABLE TO DO THIS 2nd PART].
Required Results :
Customer No. Customer name Group No. Group Name
123 Kristen Smith 5491 Zealong Tea Estate
128 Jeremy Church 5823 AA Insurance
127 Alan Li 5671 Direct Automobile
195 Graham Brown 5491 Zealong Tea Estate
167 Anna Hathaway 5823 AA Insurance
148 Henry Wilson 5823 AA Insurance
146 Simon Joe 5671 Direct Automobile
Please let me know if any other information is needed.
Sorry, if a similar question has been asked earlier - did several searches but was unable to find anything.
Thanks
Join the tables to get the group numbers of all the clients in the client list, and then select from customer_dim only the active customers that belong to those groups:
select * from customer_dim
where
cust_active_flag = 'Y'
and
groupno in (
select groupno
from client_list l inner join customer_dim d
on d.customerno = l.customerno
)
Results:
> customerno | customername | groupno | groupname | cust_active_flag
> ---------: | :------------ | ------: | :----------------- | :---------------
> 123 | Kristen Smith | 5491 | Zealong Tea Estate | Y
> 167 | Anna Hathaway | 5823 | AA Insurance | Y
> 146 | Simon Joe | 5671 | Direct Automobile | Y
> 148 | Henry Wilson | 5823 | AA Insurance | Y
> 195 | Graham Brown | 5491 | Zealong Tea Estate | Y
> 128 | Jeremy Church | 5823 | AA Insurance | Y
> 127 | Alan Li | 5671 | Direct Automobile | Y
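The IN-subquery approach can be reproduced with a short sketch using an in-memory SQLite database (an assumption made only so the example runs anywhere; the column names follow the query above, not necessarily the asker's real schema).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client_list (customerno INTEGER, customername TEXT);
CREATE TABLE customer_dim (
    customerno INTEGER, customername TEXT,
    groupno INTEGER, groupname TEXT, cust_active_flag TEXT);
INSERT INTO client_list VALUES
  (123,'Kristen Smith'),(128,'Jeremy Church'),(127,'Alan Li'),(132,'Ryan Nelson');
INSERT INTO customer_dim VALUES
  (123,'Kristen Smith',5491,'Zealong Tea Estate','Y'),
  (167,'Anna Hathaway',5823,'AA Insurance','Y'),
  (146,'Simon Joe',5671,'Direct Automobile','Y'),
  (148,'Henry Wilson',5823,'AA Insurance','Y'),
  (195,'Graham Brown',5491,'Zealong Tea Estate','Y'),
  (172,'Daria Smith',5671,'Direct Automobile','N'),
  (122,'Dyana Smith',5823,'AA Insurance','N'),
  (132,'Ryan Nelson',5671,'Direct Automobile','N'),
  (128,'Jeremy Church',5823,'AA Insurance','Y'),
  (127,'Alan Li',5671,'Direct Automobile','Y');
""")

# Inner subquery: groups that contain at least one client from the list.
# Outer query: every active customer in one of those groups.
sql = """
SELECT customerno, customername, groupno, groupname
FROM customer_dim
WHERE cust_active_flag = 'Y'
  AND groupno IN (SELECT d.groupno
                  FROM client_list l
                  JOIN customer_dim d ON d.customerno = l.customerno)
ORDER BY customerno
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Ryan Nelson (132) is in the client list but inactive, so his group 5671 still counts as a client group while he himself is filtered out, which matches the required results.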
To get the required results you need a condition in your join:
SELECT *
FROM Client c
JOIN Customer_Dim cd on c.CustomerNo = cd.CustomerNo
and cd.Cust_ActiveFlag ='Y'
or
SELECT *
FROM Client c
JOIN Customer_Dim cd on c.CustomerNo = cd.CustomerNo
WHERE cd.Cust_ActiveFlag ='Y'
I think it is pretty simple to get your posted result from the Customer_Dim table.
If you don't want the group numbers of the client list:
select * from Customer_Dim
where [Cust_Active Flag] = 'Y'
and [Group No.] not in (
select CD.[Group No.] from [Client List] as CL inner join Customer_Dim as CD on CL.[customer no.] = CD.[customer no.] )
And if you only want the group numbers of the client list:
select * from Customer_Dim
where [Cust_Active Flag] = 'Y'
and [Group No.] in (
select CD.[Group No.] from [Client List] as CL inner join Customer_Dim as CD on CL.[customer no.] = CD.[customer no.] )
If a client may be inactive and you still want to identify the client in the result set, you can do a LEFT JOIN with a GROUP BY and rely on Proc SQL's automatic remerging in the HAVING clause for the selection criteria.
data client_list; input
custno custname:& $30.; datalines;
123 Kristen Smith
128 Jeremy Church
127 Alan Li
132 Ryan Nelson
899 Julius Caesar
run;
data customer_dim; input
custno custname:& $30. groupnum groupname:& $30. Cust_Active_Flag: $1.; datalines;
123 Kristen Smith 5491 Zealong Tea Estate Y
167 Anna Hathaway 5823 AA Insurance Y
146 Simon Joe 5671 Direct Automobile Y
148 Henry Wilson 5823 AA Insurance Y
195 Graham Brown 5491 Zealong Tea Estate Y
172 Daria Smith 5671 Direct Automobile N
122 Dyana Smith 5823 AA Insurance N
132 Ryan Nelson 5671 Direct Automobile N
128 Jeremy Church 5823 AA Insurance Y
127 Alan Li 5671 Direct Automobile Y
231 Donald Duck 7434 Orange Insurance Y
899 Julius Caesar 4999 Emperors N
900 Joshua Norton 4999 Emperors N
925 Joaquin Guzman 4999 Emperors Y
925 Naruhito 4999 Emperors Y
run;
proc sql;
create table want(label="Active customers of clients groups") as
select
LIST.custno as client,
DIM.*
from
customer_dim DIM
left join
client_list LIST
on
DIM.custno = LIST.custno
group by
groupnum
having
N(LIST.custno) > 0
and
(
cust_active_flag = 'Y'
or LIST.custno is not NULL
)
order by
groupnum, custno
;

Max from Query from Select data

I am pretty new to SQL and need some help with a query. I am trying to find the MAX TradeCodeID using the following query. It is not returning the data I need. It is pretty much returning t.
select distinct
t.useremployeeid,
max(t.usertradeID),
t.Projectfullname,
t.userfirstname + ' '+ t.userlastname as GreatestPM
from
(select distinct
users.UserTradeId, UserEmployeeID, UserFirstName, UserLastName,
ProjectFullName, ProjectManager,
max(ScheduleDate) as LastDate
from
schedules
left outer join
users on ScheduleUserID = UserID
left outer join
Phases on SchedulePhaseID = PhaseID
left outer join
Projects on phases.ProjectID = projects.ProjectID
left outer join
UserTrades on UserTrades.UserTradeID = Users.UserTradeID
where
users.useractive = 1
and users.useremployeeid <> 0
and users.usertradeid between 21 and 24
and projectfullname is not null
group by
users.UserTradeid, UserEmployeeID, UserFirstName, UserLastName,
ProjectFullName, ProjectManager
having
max(scheduledate) > getdate() ) t
group by
t.projectfullname, t.userfirstname,t.userlastname, UserEmployeeID
order by
t.projectfullname
From the following data set:
useremployeeid UserTradeID Projectfullname GreatestPM
--------------------------------------------------------------------------------
12121 22 162331.05 John Smith
25487 21 166324.1 Chuck Norris
45639 21 166324.1 Brad Pitt
35789 23 166324.1 John Doe
15697 24 166324.1 Matt Damon
28957 23 166324.1 Taylor Swift
76985 21 166324.1 Tony Romo
25496 21 166324.1 George Strait
85695 22 167091.1 Robin Roberts
75632 21 167091.1 Scott Smith
66897 22 1663341.01 Garth Brooks
58766 21 1663341.01 Travis Tritt
37895 21 1663341.01 Sara Roberts
95687 21 1663352.01 Justin Timberlake
85697 24 1663352.01 Sally Walker
I am looking to get the following results:
useremployeeid UserTradeID Projectfullname GreatestPM
----------------------------------------------------------
12121 22 162331.05 John Smith
15697 24 166324.1 Matt Damon
85695 22 167091.1 Robin Roberts
66897 22 1663341.01 Garth Brooks
85697 24 1663352.01 Sally Walker
Thank you for the help.
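One common way to get this kind of result is to keep, for each project, only the rows whose UserTradeID equals the maximum for that project. The sketch below is not the asker's query: it assumes the data set shown above has already been produced by the inner query and stored in a hypothetical table named t, and it uses an in-memory SQLite database only to stay self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table t holding the data set shown in the question.
conn.executescript("""
CREATE TABLE t (useremployeeid INTEGER, usertradeid INTEGER,
                projectfullname TEXT, greatestpm TEXT);
INSERT INTO t VALUES
  (12121,22,'162331.05','John Smith'),
  (25487,21,'166324.1','Chuck Norris'),
  (45639,21,'166324.1','Brad Pitt'),
  (35789,23,'166324.1','John Doe'),
  (15697,24,'166324.1','Matt Damon'),
  (28957,23,'166324.1','Taylor Swift'),
  (76985,21,'166324.1','Tony Romo'),
  (25496,21,'166324.1','George Strait'),
  (85695,22,'167091.1','Robin Roberts'),
  (75632,21,'167091.1','Scott Smith'),
  (66897,22,'1663341.01','Garth Brooks'),
  (58766,21,'1663341.01','Travis Tritt'),
  (37895,21,'1663341.01','Sara Roberts'),
  (95687,21,'1663352.01','Justin Timberlake'),
  (85697,24,'1663352.01','Sally Walker');
""")

# Keep only the rows that carry the highest trade id within their project.
sql = """
SELECT t.useremployeeid, t.usertradeid, t.projectfullname, t.greatestpm
FROM t
WHERE t.usertradeid = (SELECT MAX(t2.usertradeid)
                       FROM t AS t2
                       WHERE t2.projectfullname = t.projectfullname)
ORDER BY t.projectfullname
"""

top_rows = conn.execute(sql).fetchall()
for row in top_rows:
    print(row)
```

If two people on the same project share the highest trade id, this form returns both of them. Grouping by name and employee id, as the original outer query does, puts every person in a group of their own, so MAX cannot pick one row per project there.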

Copy rows of data in SQL Server

Please help me come up with a solution for the situation being explained below:
ID name address age hobby GPA
---------------------------------------------------------
101 James 100 Garfield St 21 reading 3.13
101 James 100 Garfield St 21 writing 2.63
101 James 100 Garfield St 21 running 3.81
109 Tom 19 Lily Ave 19 dating 3.54
109 Tom 20 Lily Ave 19 climbing 2.76
109 Tom 21 Lily Ave 19 watching 3.91
I want to copy the set of rows with the same ID (eg. 101) and assign each set a State abbreviation(s) by running a single sql query. For instance: adding states CA, NJ, and DE to rows with an ID of 101, the result set is expected to look like this:
ID name address age hobby GPA state
-----------------------------------------------------------------------
101 James 100 Garfield St 21 reading 3.13 CA
101 James 100 Garfield St 21 writing 2.63 CA
101 James 100 Garfield St 21 running 3.81 CA
101 James 100 Garfield St 21 reading 3.13 NJ
101 James 100 Garfield St 21 writing 2.63 NJ
101 James 100 Garfield St 21 running 3.81 NJ
101 James 100 Garfield St 21 reading 3.13 DE
101 James 100 Garfield St 21 writing 2.63 DE
101 James 100 Garfield St 21 running 3.81 DE
Please keep in mind that everything else remains the same way as they were before the addition of the state abbreviations. Also assume I have more than three states to add and integrate to the query, say, I have all 50 states. Thank you for your time and effort in advance!
This should produce that result set:
select x.*, y.st
from tbl x
cross join
(select 'CA' as st union all
select 'NJ' union all
select 'DE') y
where x.id = 101
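The idea here is a cross join between the rows for one ID and a list of states, so every row is repeated once per state. A minimal sketch, assuming an in-memory SQLite database and a cut-down tbl with only a few of the columns (both are illustration choices, not part of the question), builds the state list from a Python list so it scales to all 50 states:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER, name TEXT, hobby TEXT, gpa REAL);
INSERT INTO tbl VALUES
  (101,'James','reading',3.13),(101,'James','writing',2.63),
  (101,'James','running',3.81),(109,'Tom','dating',3.54);
""")

states = ["CA", "NJ", "DE"]  # extend this list for more states

# One "SELECT ?" per state, glued with UNION ALL; the first one names the column.
derived = " UNION ALL ".join(["SELECT ? AS st"] + ["SELECT ?"] * (len(states) - 1))

sql = f"""
SELECT x.id, x.name, x.hobby, x.gpa, y.st
FROM tbl x
CROSS JOIN ({derived}) y
WHERE x.id = 101
"""

copies = conn.execute(sql, states).fetchall()
for row in copies:
    print(row)
```

Each of the three rows for ID 101 comes back once per state, nine rows in total, and the row for ID 109 is excluded by the WHERE clause.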
Create a new table that maps IDs to states:
ID ST
101 CA
101 NJ
101 DE
109 ..
Then join it to your table:
SELECT t.*, s.st
FROM tbl t
JOIN states s ON t.id = s.id
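A small sketch of the mapping-table approach, again assuming an in-memory SQLite database and a cut-down tbl. Only the three mappings for ID 101 listed above are loaded; the entry for 109 is left out because the answer does not say which states it maps to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER, name TEXT, hobby TEXT);
CREATE TABLE states (id INTEGER, st TEXT);
INSERT INTO tbl VALUES
  (101,'James','reading'),(101,'James','writing'),(101,'James','running'),
  (109,'Tom','dating'),(109,'Tom','climbing'),(109,'Tom','watching');
INSERT INTO states VALUES (101,'CA'),(101,'NJ'),(101,'DE');
""")

# Each tbl row is repeated once for every state mapped to its id.
sql = """
SELECT t.id, t.name, t.hobby, s.st
FROM tbl t
JOIN states s ON t.id = s.id
ORDER BY s.st, t.hobby
"""

joined = conn.execute(sql).fetchall()
for row in joined:
    print(row)
```

Because this is an inner join, IDs without any row in states (109 in this sketch) drop out of the result; a LEFT JOIN would keep them with a NULL state.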

How do I select records that do not appear in another table?

Table: Movie
mID title year director
101 Gone with the Wind 1939 Victor Fleming
102 Star Wars 1977 George Lucas
103 The Sound of Music 1965 Robert Wise
104 E.T. 1982 Steven Spielberg
105 Titanic 1997 James Cameron
106 Snow White 1937 <null>
107 Avatar 2009 James Cameron
108 Raiders of the Lost Ark 1981 Steven Spielberg
Table: Rating
rID mID stars ratingDate
201 101 2 2011-01-22
201 101 4 2011-01-27
202 106 4 <null>
203 103 2 2011-01-20
203 108 4 2011-01-12
203 108 2 2011-01-30
204 101 3 2011-01-09
205 103 3 2011-01-27
205 104 2 2011-01-22
205 108 4 <null>
206 107 3 2011-01-15
206 106 5 2011-01-19
207 107 5 2011-01-20
208 104 3 2011-01-02
I need to fetch the movies which have not been rated yet. In this case Titanic (mID 105) and Star Wars (mID 102) never got rated in the rating table.
I figured it out with
select distinct movie.title from movie,rating where
rating.mid!=movie.mid except select distinct movie.title from
movie,rating where rating.mid=movie.mid
however I think there might be a better (easier/cleaner) way to do it.
Simple:
SELECT Movie.*
FROM Movie
LEFT JOIN Rating ON Movie.mID = Rating.mID
WHERE Rating.mID IS NULL
If I understood your question properly, that looks like textbook application of outer joins.
You could do it like this:
SELECT * FROM Movie WHERE mid NOT IN (SELECT DISTINCT(mid) FROM Rating)
Basically it will select all records from the movie table whose 'mid' does not appear in the rating table; I am assuming 'mid' is a unique identifier for a movie. Be aware that NOT IN returns no rows at all if the subquery produces a NULL mid.
I will add another possibility.
Select [list columns here]
from Movie m
where NOT exists (SELECT * FROM RATING r where m.mid = r.mid)
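To compare the three suggestions (LEFT JOIN ... IS NULL, NOT IN, NOT EXISTS), the sketch below runs them against the question's data in an in-memory SQLite database, then adds one hypothetical rating row with a NULL mID to show how NOT IN reacts. The extra row and the use of Python's sqlite3 module are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (mID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Rating (rID INTEGER, mID INTEGER, stars INTEGER);
INSERT INTO Movie VALUES
  (101,'Gone with the Wind'),(102,'Star Wars'),(103,'The Sound of Music'),
  (104,'E.T.'),(105,'Titanic'),(106,'Snow White'),(107,'Avatar'),
  (108,'Raiders of the Lost Ark');
INSERT INTO Rating VALUES
  (201,101,2),(201,101,4),(202,106,4),(203,103,2),(203,108,4),(203,108,2),
  (204,101,3),(205,103,3),(205,104,2),(205,108,4),(206,107,3),(206,106,5),
  (207,107,5),(208,104,3);
""")

queries = {
    "left_join": """SELECT m.title FROM Movie m
                    LEFT JOIN Rating r ON m.mID = r.mID
                    WHERE r.mID IS NULL""",
    "not_in": """SELECT title FROM Movie
                 WHERE mID NOT IN (SELECT mID FROM Rating)""",
    "not_exists": """SELECT m.title FROM Movie m
                     WHERE NOT EXISTS (SELECT 1 FROM Rating r
                                       WHERE r.mID = m.mID)""",
}

def run_all():
    """Run every query and return its titles as a sorted list."""
    return {name: sorted(row[0] for row in conn.execute(sql))
            for name, sql in queries.items()}

before = run_all()

# Hypothetical rating row whose mID is NULL (not part of the question's data).
conn.execute("INSERT INTO Rating VALUES (209, NULL, 1)")
after = run_all()

print(before)
print(after)
```

On the original data all three forms agree. Once the subquery can return a NULL, mID NOT IN (...) is never true and that query comes back empty, while the LEFT JOIN and NOT EXISTS forms are unaffected.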