How to select people that don't know anyone who takes workshops - sql

I have a few tables I want to iterate over. First table is Persons:
id  name          address
1   Laura Jansen  New York
2   Sana Vendi    Miami
3   Adam Smith    Boston
4   Mo Zora       Los Angeles
The second one is TakingWorkshop. It records which workshops people are taking, so person_id refers to the id in Persons.
id  person_id  workshop_id
20  4          26
19  2          27
18  3          28
The last table is Knows. The person ids are the same as the ids in Persons, so each row says that PersonX knows PersonY.
PersonX  PersonY
1        2
1        3
2        1
4        1
So 1 is Laura, 2 is Sana, 3 is Adam and 4 is Mo. We can see that Adam doesn't know anyone, which means he automatically doesn't know anyone who takes workshops. We can also see that Laura (1) doesn't take any workshops. Mo (4) and Sana (2) know only Laura, so they don't know anyone who takes workshops either.
I wrote some code to find the people who don't know anyone taking workshops (in this database, for example, Adam).
First I do a left join of the Persons table and the Knows table, matching the id in Persons to personA_id in Knows (personA knows personB). This join gives me a table of people who know people, including the people who don't know anyone (for those, the Knows columns are null).
SELECT distinct P.name, K.personA_id
FROM Persons P LEFT JOIN Knows K
ON P.id = K.personA_id
Now I want to see if personB_id is in the person_id of TakingWorkshop. This way you can see whether the known people are taking workshops or not. personB_id should NOT be in TakingWorkshop, because that's how you filter out Laura. I did it like this:
WHERE K.personB_id NOT IN (SELECT person_id
FROM TakingWorkshop)
So my whole code looks like this
SELECT distinct P.name, K.personA_id
FROM Persons P LEFT JOIN Knows K
ON P.id = K.personA_id
WHERE K.personB_id NOT IN (SELECT person_id
FROM TakingWorkshop)
But I get no results when I do this, and I want to know what's going wrong.

Hmmm . . . Your description of the problem suggests not exists. But not exists what?
This query gets everyone who is known and taking a workshop:
select . . .
from knows k join
TakingWorkshop tw
on k.personY = tw.person_id;
So, we can slip that into the query:
select p.*
from persons p
where not exists (select 1
from knows k join
TakingWorkshop tw
on k.personY = tw.person_id
where k.personX = p.id
);
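To check the not exists query against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite and the column types are assumptions made for illustration; the table and column names follow the question's tables (personX, personY).

```python
import sqlite3

# In-memory database populated with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE TakingWorkshop (id INTEGER PRIMARY KEY, person_id INTEGER, workshop_id INTEGER);
    CREATE TABLE Knows (personX INTEGER, personY INTEGER);

    INSERT INTO Persons VALUES
        (1, 'Laura Jansen', 'New York'),
        (2, 'Sana Vendi', 'Miami'),
        (3, 'Adam Smith', 'Boston'),
        (4, 'Mo Zora', 'Los Angeles');
    INSERT INTO TakingWorkshop VALUES (20, 4, 26), (19, 2, 27), (18, 3, 28);
    INSERT INTO Knows VALUES (1, 2), (1, 3), (2, 1), (4, 1);
""")

# Keep a person only if no Knows row links them to someone taking a workshop.
query = """
    SELECT p.name
    FROM Persons p
    WHERE NOT EXISTS (
        SELECT 1
        FROM Knows k
        JOIN TakingWorkshop tw ON k.personY = tw.person_id
        WHERE k.personX = p.id
    )
    ORDER BY p.id
"""
names = [row[0] for row in conn.execute(query)]
print(names)
```

On these rows the query should list Sana Vendi, Adam Smith and Mo Zora: Laura is the only person who knows somebody taking a workshop, and people with no Knows rows at all are kept because the subquery finds nothing for them.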

Related

Counting prefixes in a joined table in SQL

I am trying to count how many makes of car a person owns. Car makes are only defined by a prefix in my Links table.
Table 1 (Person)
UniqueID Name
PER0001 Adrian
PER0002 Michael
Per0003 James
Table 2 (Links)
UniqueID LinkEnd1_ID LinkEnd2_ID
LIN0001 PER0001 FER02332
LIN0002 PER0001 FER02112
LIN0003 PER0001 POR12122
LIN0004 PER0002 FER12321
LIN0005 PER0003 MAS12382
LIN0006 PER0003 FER22982
LIN0006 PER0003 MAS12232
Output (option 1)
Name Car_Make Count
Adrian FER 2
Adrian POR 1
Michael FER 1
James MAS 2
James FER 1
Output (option 2 - preferred)
Name     FER  POR  MAS
Adrian   2    1
Michael  1
James    1         2
The reason I am using a link table to count the number of car makes is because every car make has a different table I would need to join in.
I've tried
select count left(LinkEnd2_ID,3), which doesn't work. I've also tried group by, which I can't seem to crack.
I guess what I want to be able to do is
select
count(left(LinkEnd2_ID,3)='FER'
,count(left(LinkEnd2_ID,3)='POR'
,count(left(LinkEnd2_ID,3)='MAS'
but that's a query inside a select and I can't work out how to code it properly.
Here's where I am starting from (or the base I keep going back to when I start afresh):
SELECT
Person.Unique_ID
,Person.Name
,left(Link.LinkEnd2_ID,3) as Car_Make
FROM
Person
LEFT JOIN
Links as Link
on Person.Unique_ID = Link.LinkEnd1_ID
Any help you can offer would be appreciated.
Nearly there, you just need to add a group by, and change all the columns to aggregate functions.
Your option 1:
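SELECT
max(Person.Name) as Person_Name
,left(Link.LinkEnd2_ID,3) as Car_Make
-- count the linked rows, so a person without any links gets 0 instead of 1
,count(Link.LinkEnd2_ID) as No_of_Car
FROM
Person
LEFT JOIN
Links as Link
on Person.Unique_ID = Link.LinkEnd1_ID
GROUP BY
Person.Unique_ID
-- group by the make prefix as well, so each person/make pair gets its own count
,left(Link.LinkEnd2_ID,3)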
For your option 2, you need to wrap the aggregate functions around case expressions.
You have to hardcode the three car makes, so this won't work if the number of makes is unknown.
SELECT
max(Person.Name) as Person_Name
,sum(case when left(Link.LinkEnd2_ID,3) ='FER' then 1 else 0 end) as FER
,sum(case when left(Link.LinkEnd2_ID,3) ='POR' then 1 else 0 end) as POR
,sum(case when left(Link.LinkEnd2_ID,3) ='MAS' then 1 else 0 end) as MAS
FROM
Person
LEFT JOIN
Links as Link
on Person.Unique_ID = Link.LinkEnd1_ID
GROUP BY
Person.Unique_ID
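Both outputs can be tried on the sample rows with a small sketch using Python's built-in sqlite3 module. Two assumptions are made for illustration: SQLite has no left() function, so substr(x, 1, 3) stands in for left(x, 3), and the person id Per0003 is written as PER0003 so that it matches the Links rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (Unique_ID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Links (Unique_ID TEXT, LinkEnd1_ID TEXT, LinkEnd2_ID TEXT);

    -- Assumption: 'Per0003' from the question is normalised to 'PER0003'.
    INSERT INTO Person VALUES
        ('PER0001', 'Adrian'), ('PER0002', 'Michael'), ('PER0003', 'James');
    INSERT INTO Links VALUES
        ('LIN0001', 'PER0001', 'FER02332'),
        ('LIN0002', 'PER0001', 'FER02112'),
        ('LIN0003', 'PER0001', 'POR12122'),
        ('LIN0004', 'PER0002', 'FER12321'),
        ('LIN0005', 'PER0003', 'MAS12382'),
        ('LIN0006', 'PER0003', 'FER22982'),
        ('LIN0006', 'PER0003', 'MAS12232');
""")

# Option 1: one row per person and make; the prefix is part of the GROUP BY.
per_make = conn.execute("""
    SELECT max(Person.Name) AS Person_Name,
           substr(Link.LinkEnd2_ID, 1, 3) AS Car_Make,
           count(Link.LinkEnd2_ID) AS No_of_Car
    FROM Person
    LEFT JOIN Links AS Link ON Person.Unique_ID = Link.LinkEnd1_ID
    GROUP BY Person.Unique_ID, substr(Link.LinkEnd2_ID, 1, 3)
    ORDER BY Person.Unique_ID, Car_Make
""").fetchall()

# Option 2: one row per person; each hardcoded make gets its own summed CASE.
pivot = conn.execute("""
    SELECT max(Person.Name) AS Person_Name,
           sum(CASE WHEN substr(Link.LinkEnd2_ID, 1, 3) = 'FER' THEN 1 ELSE 0 END) AS FER,
           sum(CASE WHEN substr(Link.LinkEnd2_ID, 1, 3) = 'POR' THEN 1 ELSE 0 END) AS POR,
           sum(CASE WHEN substr(Link.LinkEnd2_ID, 1, 3) = 'MAS' THEN 1 ELSE 0 END) AS MAS
    FROM Person
    LEFT JOIN Links AS Link ON Person.Unique_ID = Link.LinkEnd1_ID
    GROUP BY Person.Unique_ID
    ORDER BY Person.Unique_ID
""").fetchall()

print(per_make)
print(pivot)
```

Note that the pivoted form shows 0 rather than a blank for a make the person does not own.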

How to combine SQL queries to build conditions for another one

I've the following tables
series_trailers:
ID EPISODEID CONTENT AUTHOR
-----------------------------
1 122383 url1 Peter
2 9999 url2 Ana
3 923822 stuff Jhon
4 122384 url3 Drake
series_episodes:
ID TITLE SERIESID
--------------------------------
122383 Episode 1 23
9999 Somethingweird 87
923822 Randomtitle 52
122384 Episode 2 23
series:
ID TITLE
-------------------
23 Stranger Things
87 Seriesname
512 Sometrashseries
As you can see there are three tables: one with the series info, one with the series' episodes and another one which contains urls that redirect to the episodes' trailers. I'd like to get the latest rows from series_trailers but without repeating the series they're from.
I've tried SELECT DISTINCT EPISODEID FROM series_trailers ORDER BY id DESC but there are two rows whose episodes belong to the same series, so I get the series Stranger Things twice. In short, I'd like to display the latest series with new urls without getting duplicated series (which is what I get with the SQL above).
EDIT: What I'm supposed to get:
Last updated series:
Stranger Things
Seriesname
Sometrashseries
What I'd get with my sql code:
Stranger Things
Seriesname
Sometrashseries
Stranger Things (again)
If I understood correctly, here is the latest trailer for each series (latest as in the highest series_trailers ID, so most likely the one added last).
WITH MostRecentTrailers
AS (
SELECT MAX(st.ID) "TRAILERID"
,s.ID "SERIESID"
,s.TITLE "SERIESTITLE"
FROM series_trailers st
JOIN series_episodes se ON se.ID = st.EPISODEID
JOIN series s ON s.ID = se.SERIESID
GROUP BY s.ID
,s.TITLE
)
SELECT *
FROM MostRecentTrailers mrt
JOIN series_trailers st ON st.ID = mrt.TRAILERID
ORDER BY mrt.SERIESID DESC
Let me know if that does it for ya.
Edit: Fixed some typo mistakes.
This gives you the trailers for the latest episode of each series. This answer is based on the assumption that the episode with the highest ID is the latest one.
select id, content from series_trailers where episodeid in
(select max(id)
from series_episodes
group by seriesid)
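As a rough check of the "one latest trailer per series" idea, here is a sketch with Python's built-in sqlite3 module. Two things are assumptions for illustration: the sample data lists SERIESID 52 for one episode while the series table has ID 512, so the sketch uses 512 in both places, and the result is ordered by trailer ID so the most recently added trailer comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series (ID INTEGER PRIMARY KEY, TITLE TEXT);
    CREATE TABLE series_episodes (ID INTEGER PRIMARY KEY, TITLE TEXT, SERIESID INTEGER);
    CREATE TABLE series_trailers (ID INTEGER PRIMARY KEY, EPISODEID INTEGER,
                                  CONTENT TEXT, AUTHOR TEXT);

    INSERT INTO series VALUES
        (23, 'Stranger Things'), (87, 'Seriesname'), (512, 'Sometrashseries');
    -- Assumption: episode 923822 belongs to series 512 (the question shows 52).
    INSERT INTO series_episodes VALUES
        (122383, 'Episode 1', 23), (9999, 'Somethingweird', 87),
        (923822, 'Randomtitle', 512), (122384, 'Episode 2', 23);
    INSERT INTO series_trailers VALUES
        (1, 122383, 'url1', 'Peter'), (2, 9999, 'url2', 'Ana'),
        (3, 923822, 'stuff', 'Jhon'), (4, 122384, 'url3', 'Drake');
""")

# The CTE keeps the highest trailer ID per series; the outer query fetches
# that trailer's row and sorts newest first.
query = """
    WITH MostRecentTrailers AS (
        SELECT MAX(st.ID) AS TRAILERID, s.ID AS SERIESID, s.TITLE AS SERIESTITLE
        FROM series_trailers st
        JOIN series_episodes se ON se.ID = st.EPISODEID
        JOIN series s ON s.ID = se.SERIESID
        GROUP BY s.ID, s.TITLE
    )
    SELECT mrt.SERIESTITLE, st.CONTENT
    FROM MostRecentTrailers mrt
    JOIN series_trailers st ON st.ID = mrt.TRAILERID
    ORDER BY mrt.TRAILERID DESC
"""
latest = conn.execute(query).fetchall()
print(latest)
```

Each series appears once, even though Stranger Things has two trailers.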

SQL Newbie Stuck on Manager Flag Field Logic

This is my first month of being a Data Analyst, and I can't seem to find an answer that is specific enough to my problem here to help. I am having trouble getting this manager flag field to work and I think I'm getting confused by the joins.
The goal is to match the EMPLIDs of JOB_VW A to see if they exist in the Supervisor_ID column of Supervisor_VW K. Supervisor_VW K has ALL employees in the company (including supervisors) in the K.EMPLID column. Someone can be in both the supervisor ID column and EMPL column at the same time, but in different rows. The SUP ID is the EMPL ID of someone in a manager position.
For example:
Supervisor_VW K
EMPL ID EMPL NAME SUP ID SUP NAME
1 Smith, John 2 William, Mark
5 Jarvis, John 2 William, Mark
2 William, Mark 4 Rover, Spot
The results I am getting are as such
QUERY RESULTS
EMPLID EMPL NAME MANAGER FLAG
2 William, Mark Y
2 William, Mark Y
4 Rover, Spot Y
1 Smith, John N
5 Jarvis, John N
My current code is as follows:
SELECT CASE WHEN K.Supervisor_Id IS NULL THEN
'N'
ELSE
'Y'
END AS "ManagerFlag".....
FROM (SELECT K.*
FROM SUPERVISOR_VW K, JOB_VW A
WHERE K.SUPERVISOR_ID (+) = A.EMPLID
AND EXISTS (SELECT K1.EMPLID, K1.SUPERVISOR_ID
FROM SUPERVISOR_VW K1
WHERE K1.EMPLID IN K1.Supervisor_Id)
) K
So it seems that I am getting duplicate supervisor rows for every single employee that they supervise. If they supervise only one person, I get a singular row. If they supervise 20, I get 20 duplicate rows of that supervisor. HOWEVER, their employee that they supervise shows up in the table without issue and is properly labeled as N, no duplicates.
If anyone could help, please do! I appreciate you reading through my work, let me know if more info is required.
The usual way to achieve the desired result is like this:
select emplid, emplname,
case when emplid in (select supervisor_id from supervisor_vw) then 'Y'
else 'N' end as managerflag
from supervisor_vw;
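A small sketch of this approach on the three example rows, using Python's built-in sqlite3 module. A plain table stands in for the view, and the column names emplname and supname are assumptions made for illustration (the question only shows the headings "EMPL NAME" and "SUP NAME").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A table standing in for the SUPERVISOR_VW view; column names are assumed.
    CREATE TABLE supervisor_vw (emplid INTEGER, emplname TEXT,
                                supervisor_id INTEGER, supname TEXT);
    INSERT INTO supervisor_vw VALUES
        (1, 'Smith, John', 2, 'William, Mark'),
        (5, 'Jarvis, John', 2, 'William, Mark'),
        (2, 'William, Mark', 4, 'Rover, Spot');
""")

# IN only tests membership, so a supervisor with several reports still
# produces a single row, unlike a join against the supervisor column.
query = """
    SELECT emplid, emplname,
           CASE WHEN emplid IN (SELECT supervisor_id FROM supervisor_vw)
                THEN 'Y' ELSE 'N' END AS managerflag
    FROM supervisor_vw
    ORDER BY emplid
"""
flags = conn.execute(query).fetchall()
print(flags)
```

William, Mark supervises two people here but is listed once, which is the duplication the join-based version runs into.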

Multiple JOIN (SQL)

My problem is Play! Framework / JPA specific. But I think it's applicable to general SQL syntax.
Here is a sample query with a simple JOIN:
return Post.find(
"select distinct p from Post p join p.tags as t where t.name = ?", tag
).fetch();
It's simple and works well.
My question is: What if I want to JOIN on more values in the same table?
Example (Doesn't work. It's a pseudo-syntax I created):
return Post.find(
"select distinct p from Post p join p.tags1 as t, p.tags2 as u, p.tags3 as v where t.name = ?, u.name = ?, v.name = ?", tag1, tag2, tag3,
).fetch();
Your programming logic seems okay, but the SQL statement needs some work. Seems you're new to SQL, and as you pointed out, you don't seem to understand what a JOIN is.
You're trying to select data from 4 tables named POST, TAG1, TAG2, and TAG3.
I don't know what's in these tables, and it's hard to give sample SQL statements without that information. So, I'm going to make something up, just for the purposes of discussion. Let's say that table POST has 6 columns, and there are 8 rows of data in it.
P Fname Lname Country Color Headgear
- ----- ----- ------- ----- --------
1 Alex Andrews 1 1 0
2 Bob Barker 2 3 0
3 Chuck Conners 1 5 0
4 Don Duck 3 6 1
5 Ed Edwards 2 4 2
6 Frank Farkle 4 2 1
7 Geoff Good 1 1 0
8 Hank Howard 1 3 0
We'll say that TAG1, TAG2, and TAG3 are lookup tables, with only 2 columns each. Table TAG1 has 4 country codes:
C Name
- -------
1 USA
2 France
3 Germany
4 Spain
Table TAG2 has 6 Color codes:
C Name
- ------
1 Red
2 Orange
3 Yellow
4 Green
5 Blue
6 Violet
Table TAG3 has 4 Headgear codes:
C Name
- -------
0 None
1 Glasses
2 Hat
3 Monacle
Now, when you select data from these 4 tables, for P=6, you're trying to get something like this:
Fname Lname Country Color Headgear
----- ------ ------- ------ -------
Frank Farkle Spain Orange Glasses
First thing, let's look at your WHERE clause:
where t.name = ?, u.name = ?, v.name = ?
Sorry, but using commas like this is a syntax error. Normally you only want to find data where all 3 conditions are true; you do this by using AND:
where t.name=? AND u.name=? AND v.name=?
Second, why are you joining tables together? Because you need more information. Table POST says that Frank's COUNTRY value is 4; table TAG1 says that 4 means Spain. So we need to "join" these tables together.
The ancient (pre-SQL-92) way to join tables is to list more than one table name in the FROM clause, separated by commas. This gives us:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P, TAG1 T, TAG2 U, TAG3 V
The trouble with this query is that you're not telling it WHICH rows you want, or how they relate to each other. So the database generates something called a "Cartesian Product". It's extremely rare that you want a Cartesian Product - normally this is a HUGE MISTAKE. Even though your database only has 22 rows in it, this SELECT statement is going to return 768 rows of data:
Alex Andrews USA Red None
Alex Andrews USA Red Glasses
Alex Andrews USA Red Hat
Alex Andrews USA Red Monacle
Alex Andrews USA Orange None
Alex Andrews USA Orange Glasses
...
Hank Howard Spain Violet Monacle
That's right, it returns every possible combination of data from the 4 tables. Imagine for a second that the POST table eventually grows to 20000 rows, and the three TAG tables have 100 rows each. The whole database would be less than a megabyte, but the Cartesian Product would have 20,000,000,000 rows of data -- probably about 120 GB of data. Any database engine would choke on that.
So if you want to use the Ancient way of specifying tables, it is VERY IMPORTANT to make sure that your WHERE clause shows the relationship between every table you're querying. This makes a lot more sense:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P, TAG1 T, TAG2 U, TAG3 V
WHERE P.Country=T.C AND P.Color=U.C AND P.Headgear=V.C
This only returns 8 rows of data.
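The two row counts can be checked with a short sketch using Python's built-in sqlite3 module. SQLite and the column types are assumptions for illustration; the rows are the made-up sample data from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE POST (P INTEGER PRIMARY KEY, Fname TEXT, Lname TEXT,
                       Country INTEGER, Color INTEGER, Headgear INTEGER);
    CREATE TABLE TAG1 (C INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TAG2 (C INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TAG3 (C INTEGER PRIMARY KEY, Name TEXT);

    INSERT INTO POST VALUES
        (1, 'Alex', 'Andrews', 1, 1, 0), (2, 'Bob', 'Barker', 2, 3, 0),
        (3, 'Chuck', 'Conners', 1, 5, 0), (4, 'Don', 'Duck', 3, 6, 1),
        (5, 'Ed', 'Edwards', 2, 4, 2), (6, 'Frank', 'Farkle', 4, 2, 1),
        (7, 'Geoff', 'Good', 1, 1, 0), (8, 'Hank', 'Howard', 1, 3, 0);
    INSERT INTO TAG1 VALUES (1, 'USA'), (2, 'France'), (3, 'Germany'), (4, 'Spain');
    INSERT INTO TAG2 VALUES (1, 'Red'), (2, 'Orange'), (3, 'Yellow'),
                            (4, 'Green'), (5, 'Blue'), (6, 'Violet');
    INSERT INTO TAG3 VALUES (0, 'None'), (1, 'Glasses'), (2, 'Hat'), (3, 'Monacle');
""")

# Comma-separated tables with no WHERE clause: every combination of rows.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM POST P, TAG1 T, TAG2 U, TAG3 V"
).fetchone()[0]

# The same FROM list, but the WHERE clause ties each lookup table to POST.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM POST P, TAG1 T, TAG2 U, TAG3 V
    WHERE P.Country = T.C AND P.Color = U.C AND P.Headgear = V.C
""").fetchone()[0]

print(cross_count, joined_count)
```

The first count is 8 x 4 x 6 x 4 = 768; the second is one row per POST row.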
Using the Ancient way, it's easy to accidentally create Cartesian Products, which are almost always bad. So they revised SQL to make it harder to do. That's the JOIN keyword. Now, when you specify additional tables you can specify how they relate at the same time. The New Way is:
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P
INNER JOIN TAG1 T ON P.Country=T.C
INNER JOIN TAG2 U ON P.Color=U.C
INNER JOIN TAG3 V ON P.Headgear=V.C
You can still use a WHERE clause, too.
SELECT P.FNAME, P.LNAME, T.NAME As Country, U.NAME As Color, V.NAME As Headgear
FROM POST P
INNER JOIN TAG1 T ON P.Country=T.C
INNER JOIN TAG2 U ON P.Color=U.C
INNER JOIN TAG3 V ON P.Headgear=V.C
WHERE P.P=?
If you call this and pass in the value 6, you get only one row back:
Fname Lname Country Color Headgear
----- ------ ------- ------ --------
Frank Farkle Spain Orange Glasses
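For completeness, here is a sketch of the JOIN version with a bound parameter, again using Python's built-in sqlite3 module on the made-up sample data. The ? placeholder is sqlite3's own parameter style, which is an assumption here; the Play/JPA query in the question binds its parameters through its own API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE POST (P INTEGER PRIMARY KEY, Fname TEXT, Lname TEXT,
                       Country INTEGER, Color INTEGER, Headgear INTEGER);
    CREATE TABLE TAG1 (C INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TAG2 (C INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TAG3 (C INTEGER PRIMARY KEY, Name TEXT);

    INSERT INTO POST VALUES
        (1, 'Alex', 'Andrews', 1, 1, 0), (2, 'Bob', 'Barker', 2, 3, 0),
        (3, 'Chuck', 'Conners', 1, 5, 0), (4, 'Don', 'Duck', 3, 6, 1),
        (5, 'Ed', 'Edwards', 2, 4, 2), (6, 'Frank', 'Farkle', 4, 2, 1),
        (7, 'Geoff', 'Good', 1, 1, 0), (8, 'Hank', 'Howard', 1, 3, 0);
    INSERT INTO TAG1 VALUES (1, 'USA'), (2, 'France'), (3, 'Germany'), (4, 'Spain');
    INSERT INTO TAG2 VALUES (1, 'Red'), (2, 'Orange'), (3, 'Yellow'),
                            (4, 'Green'), (5, 'Blue'), (6, 'Violet');
    INSERT INTO TAG3 VALUES (0, 'None'), (1, 'Glasses'), (2, 'Hat'), (3, 'Monacle');
""")

# Each lookup table is joined with its own ON condition; the WHERE clause
# then narrows the result to a single post.
query = """
    SELECT P.Fname, P.Lname, T.Name AS Country, U.Name AS Color, V.Name AS Headgear
    FROM POST P
    INNER JOIN TAG1 T ON P.Country = T.C
    INNER JOIN TAG2 U ON P.Color = U.C
    INNER JOIN TAG3 V ON P.Headgear = V.C
    WHERE P.P = ?
"""
row = conn.execute(query, (6,)).fetchone()
print(row)
```

Keeping the join conditions in ON and the filter in WHERE makes it harder to forget one of the relationships and fall back into a Cartesian product.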
As was mentioned in the comments, you are looking for an ON clause.
SELECT * FROM TEST1
INNER JOIN TEST2 ON TEST1.A = TEST2.A AND TEST1.B = TEST2.B ...
See example usage of join here:
http://en.wikibooks.org/wiki/Java_Persistence/Relationships#Join_Fetching

Using different columns values twice in a single SQL query?

I have a MySQL table called "User" containing multiple mixed values like this:
[user_id] [user_email] [birthday]
---------------------------------
1 x#xxx.com 01/01/1981
2 y#yyy.com 02/02/1982
3 z#zzz.com 03/03/1983
I have another table called "Name" which contains name of the user, but also of some movies like this:
[node_id] [name] [user_id]
----------------------------------
9 John Doe 1
10 Star Wars 90
11 Mike Smith 2
12 Mary Lord 3
13 Rocky III 91
Finally, I have a third table named "Vote", which is a relationship between a user and some movies he likes.
[vote_id] [node_id] [user_id]
------------------------------
1 10 1
2 10 2
3 13 1
12 10 3
13 13 2
What I'm struggling to do is pull a query with twice the "name" value for two separate things: the name of the user, and the name of the movie he likes. Like this:
[user_id] [user_name] [Birthday] [movie_name]
-------------------------------------------------
1 John Doe 01/01/1981 Star Wars
2 Mike Smith 02/02/1982 Star Wars
1 John Doe 01/01/1981 Rocky III
3 Mary Lord 03/03/1983 Star Wars
2 Mike Smith 02/02/1982 Rocky III
SELECT user.id,
node.name,
user.birthday,
IF(node.type = "movie", node.name, "")
FROM user,
node
JOIN vote ON vote.user_id = user.user_id
WHERE user.id = node.id
I think I'm all mixed up... can anyone help, please?
Assuming your schema is exactly what you posted above this should work verbatim.
Query
SELECT user.user_id,
node.name user_name,
user.birthday,
(select node.name from node where node_id = vote.node_id) as movie_name
FROM user
JOIN node ON user.user_id = node.user_id
JOIN vote ON vote.user_id = user.user_id
You have got the database structure wrong. Store the user name in your first table "User"
I would strongly suggest that you store the user_name in the users table. With that change you can then have a much more simple query and a properly normalized schema.
New proposed schema.
users table
(Added name column)
[user_id][name][user_email][birthday]
1 name1 x#xxx.com 01/01/1981
2 name2 y#yyy.com 02/02/1982
3 name3 z#zzz.com 03/03/1983
nodes table (call this movies)
(removed user entries and the user_id column as you'll be using votes to link these to users)
[node_id] [name]
10 Star Wars
13 Rocky III
votes table (call this something like movies_users)
(removed the vote_id column as it's just a join table)
[node_id] [user_id]
10 1
10 2
13 1
10 3
13 2
Then your query should look something like this:
select users.user_id, users.name, users.birthday, nodes.name as movie_name
from users
join votes on users.user_id = votes.user_id
join nodes on votes.node_id = nodes.node_id
select user_id,user_name,birthday,name
from user,name,vote
where (and here you do all the joins like user_id from one table equals user_id from another table)
But there is a problem that makes it impossible for me to write the correct code: you have two fields in two different tables, user_name and name. Do you want to join the tables by this name? I don't understand. I think you are mixing the movie names with the user names; please reformulate the question.
I agree with the other answers that you would be better off if you moved the user name into the user table. However, if you are stuck with your current table structure, try this:
SELECT user.user_id,
uname.name user_name,
user.birthday,
movie.name movie_name
FROM user
JOIN node uname ON uname.user_id = user.user_id
JOIN vote ON vote.user_id = user.user_id
JOIN node movie ON vote.node_id = movie.node_id
(Assuming votes can only be cast for Movies, it should be unnecessary to blank out non-movies as these should never exist.)
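To see the double join on the name table in action, here is a sketch with Python's built-in sqlite3 module and the sample rows from the question. The table is called node, as in the queries above (the question's text calls it "Name"), and the ORDER BY on vote_id is an addition so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_email TEXT, birthday TEXT);
    CREATE TABLE node (node_id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    CREATE TABLE vote (vote_id INTEGER PRIMARY KEY, node_id INTEGER, user_id INTEGER);

    INSERT INTO user VALUES
        (1, 'x#xxx.com', '01/01/1981'),
        (2, 'y#yyy.com', '02/02/1982'),
        (3, 'z#zzz.com', '03/03/1983');
    INSERT INTO node VALUES
        (9, 'John Doe', 1), (10, 'Star Wars', 90), (11, 'Mike Smith', 2),
        (12, 'Mary Lord', 3), (13, 'Rocky III', 91);
    INSERT INTO vote VALUES
        (1, 10, 1), (2, 10, 2), (3, 13, 1), (12, 10, 3), (13, 13, 2);
""")

# node is joined twice under different aliases: once to look up the
# user's name (uname) and once to look up the voted movie's name (movie).
query = """
    SELECT user.user_id,
           uname.name AS user_name,
           user.birthday,
           movie.name AS movie_name
    FROM user
    JOIN node uname ON uname.user_id = user.user_id
    JOIN vote ON vote.user_id = user.user_id
    JOIN node movie ON vote.node_id = movie.node_id
    ORDER BY vote.vote_id
"""
rows = conn.execute(query).fetchall()
for r in rows:
    print(r)
```

Giving the same table two aliases is what lets one query read the name column for two different purposes.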