SQL query to find rows with at least one of the specified values

Suppose you had two tables. One called MOVIES:
MovieId
MovieName
Then another called ACTORS that contains people who appear in those movies:
MovieId
ActorName
Now, I want to write a query that returns any movie that contains ONE OR MORE of the following actors: "Tom Hanks", "Russell Crowe" or "Arnold Schwarzenegger".
One way to do it would be something like:
SELECT DISTINCT A.MovieId, M.MovieName FROM ACTORS A
INNER JOIN MOVIES M USING (MovieId)
WHERE A.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger');
This works, but in my case I may have several more conditions like this in the WHERE clause, so I want to make the MOVIES table the primary table I select from.
What's the best way to query for this? I'm using Oracle 11g if that matters, but I'm hoping for a standard SQL method.

You can use EXISTS or IN subqueries:
SELECT *
FROM MOVIES m
WHERE EXISTS
(
SELECT *
FROM ACTORS a
WHERE a.MovieId = m.MovieId
AND a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
)
or
SELECT *
FROM MOVIES m
WHERE m.MovieId IN
(
SELECT a.MovieId
FROM ACTORS a
WHERE a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
)
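To see why the subquery forms need no DISTINCT, here is a minimal sketch that runs the EXISTS query from Python against an in-memory SQLite database. The sample rows are made up for illustration, and SQLite stands in for Oracle only because it is easy to run; the comparison with a plain join is an addition for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MOVIES (MovieId INTEGER PRIMARY KEY, MovieName TEXT);
    CREATE TABLE ACTORS (MovieId INTEGER, ActorName TEXT);
    -- Made-up sample data: movie 1 has two of the wanted actors.
    INSERT INTO MOVIES VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
    INSERT INTO ACTORS VALUES
        (1, 'Tom Hanks'), (1, 'Russell Crowe'),
        (2, 'Arnold Schwarzenegger'),
        (3, 'Someone Else');
""")

# EXISTS tests each movie once, so a movie appears at most once in the result.
exists_rows = conn.execute("""
    SELECT m.MovieId, m.MovieName
    FROM MOVIES m
    WHERE EXISTS (
        SELECT 1
        FROM ACTORS a
        WHERE a.MovieId = m.MovieId
          AND a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
    )
    ORDER BY m.MovieId
""").fetchall()

# A plain join without DISTINCT yields one row per matching actor instead.
join_rows = conn.execute("""
    SELECT m.MovieId, m.MovieName
    FROM MOVIES m
    JOIN ACTORS a ON a.MovieId = m.MovieId
    WHERE a.ActorName IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger')
    ORDER BY m.MovieId
""").fetchall()

print(exists_rows)
print(join_rows)
```

With this data the EXISTS query lists each matching movie once, while the join lists movie 1 twice, which is why the original query needed DISTINCT.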

First you should have a 3rd table implementing the n:m relationship:
CREATE TABLE movie (
movie_id int primary key
,moviename text
-- more fields
);
CREATE TABLE actor (
actor_id int primary key
,actorname text
-- more fields
);
CREATE TABLE movieactor (
movie_id int references movie(movie_id)
,actor_id int references actor(actor_id)
,CONSTRAINT movieactor_pkey PRIMARY KEY (movie_id, actor_id)
);
Then you select like this:
SELECT DISTINCT m.movie_id, m.moviename
FROM movie m
JOIN movieactor ma USING (movie_id)
JOIN actor a USING (actor_id)
WHERE a.actorname IN ('Tom Hanks', 'Russell Crowe', 'Arnold Schwarzenegger');
Note that text literals are enclosed in single quotes!

Related

Shorter way to write this SQL query with a 3-way join and a condition

I'm currently learning SQL in the Harvard CS50 online course. The assignment is to write various SQL queries for a database. Here is a link to the assignment. I'm talking about the 12th query there.
The schema of the database looks like this:
CREATE TABLE movies (
id INTEGER,
title TEXT NOT NULL,
year NUMERIC,
PRIMARY KEY(id)
);
CREATE TABLE stars (
movie_id INTEGER NOT NULL,
person_id INTEGER NOT NULL,
FOREIGN KEY(movie_id) REFERENCES movies(id),
FOREIGN KEY(person_id) REFERENCES people(id)
);
CREATE TABLE directors (
movie_id INTEGER NOT NULL,
person_id INTEGER NOT NULL,
FOREIGN KEY(movie_id) REFERENCES movies(id),
FOREIGN KEY(person_id) REFERENCES people(id)
);
CREATE TABLE ratings (
movie_id INTEGER NOT NULL,
rating REAL NOT NULL,
votes INTEGER NOT NULL,
FOREIGN KEY(movie_id) REFERENCES movies(id)
);
CREATE TABLE people (
id INTEGER,
name TEXT NOT NULL,
birth NUMERIC,
PRIMARY KEY(id)
);
The goal is to write a SQL query that returns the titles of all movies in which both Johnny Depp and Helena Bonham Carter starred. The query I came up with returns the list of movies for each actor and then uses INTERSECT on both of these lists. This is the query:
SELECT
movies.title
FROM
movies
JOIN stars ON movies.id = stars.movie_id
JOIN people ON stars.person_id = people.id
WHERE
people.name = "Johnny Depp"
INTERSECT
SELECT
movies.title
FROM
movies
JOIN stars ON movies.id = stars.movie_id
JOIN people ON stars.person_id = people.id
WHERE
people.name = "Helena Bonham Carter";
The query returns the correct results, but I feel it isn't very elegant or fast. Is there a shorter, more elegant and/or faster way to write this?
You could do something like this:
with cte as (
select
movie_id,count(p.id) as cnt
from stars s
join people p on p.id = s.person_id
where p.name in (
'Johnny Depp', 'Helena Bonham Carter'
)
group by movie_id
)
select title
from movies
where id in (
select movie_id from cte where cnt = 2
)
The CTE counts, for each movie, how many of the two named people appear in the stars table. Movies with a count of 2 star both of them, and the outer query looks up their titles in the movies table. You could probably also use HAVING instead of the CTE.
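The HAVING variant mentioned above might look like the following sketch, run from Python against an in-memory SQLite database. The rows are made up, the schema is trimmed to the columns the query needs, and COUNT(DISTINCT p.name) is used here (an assumption beyond the original answer) so that a duplicated stars row cannot inflate the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE stars (movie_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    -- Made-up rows: only movie 1 stars both people.
    INSERT INTO movies VALUES (1, 'Both'), (2, 'Depp only'), (3, 'Carter only');
    INSERT INTO people VALUES (10, 'Johnny Depp'), (20, 'Helena Bonham Carter');
    INSERT INTO stars VALUES (1, 10), (1, 20), (2, 10), (3, 20);
""")

# Keep only the two actors, group per movie, and require both names to be present.
titles = [row[0] for row in conn.execute("""
    SELECT m.title
    FROM movies m
    JOIN stars s ON s.movie_id = m.id
    JOIN people p ON p.id = s.person_id
    WHERE p.name IN ('Johnny Depp', 'Helena Bonham Carter')
    GROUP BY m.id, m.title
    HAVING COUNT(DISTINCT p.name) = 2
""")]

print(titles)
```

This folds the filter, the count, and the title lookup into one statement, at the cost of joining movies before grouping.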

Return everything from many-to-many relationship with only one query

I'll give an example to better clarify what I want:
Suppose I have the following classes in my programming language:
Class Person(
int id,
string name,
List<Car> cars
);
Class Car(
int id,
string name,
string brand
)
I want to save that in a PostgreSQL database, so I'll have the following tables:
CREATE TABLE person(
id SERIAL,
name TEXT
);
CREATE TABLE car(
id SERIAL,
name TEXT,
brand TEXT
);
CREATE TABLE person_car(
person_id int,
car_id int,
CONSTRAINT fk_person
FOREIGN KEY (person_id)
REFERENCES person(id),
CONSTRAINT fk_car
FOREIGN KEY (car_id)
REFERENCES car(id)
);
Then, I want to select all people with their cars from DB. I can select all people, then for each person, select their cars. But supposing I have 1000 people, I will have to query the DB 1001 times (one to select all people, and one for each person, to get their cars).
Is there an efficient way to bring all people, each with all their cars in a single query, so that I can fill my classes with the correct data without querying the DB a lot of times?
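If you want to return a hierarchical dataset, you can use a correlated subquery that aggregates each person's cars into JSON, with COALESCE supplying an empty array for people who have no cars, for example: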
SELECT
p.id,
p.name,
COALESCE((SELECT
json_agg(json_build_object(
'id', c.id,
'name', c.name,
'brand', c.brand
))
FROM car AS c
JOIN person_car pc ON c.id = pc.car_id
WHERE pc.person_id = p.id), '[]'::json) AS cars
FROM person AS p;
Alternatively, you can join person and car through person_car on their respective IDs and get one flat row per person/car pair:
SELECT
person.name AS person_name,
person.id AS person_id,
car.name AS car_name,
car.brand,
car.id AS car_id
FROM
person
JOIN
person_car
ON
person.id = person_car.person_id
JOIN
car
ON
car.id = person_car.car_id
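As a rough sketch of how the flat rows from such a join can fill the Person and Car classes with a single query, the example below groups the rows in application code. It uses Python with an in-memory SQLite database and made-up rows purely for illustration. It also swaps in LEFT JOIN (an assumption, not part of the answer above) so that people without cars are still returned.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Car:
    id: int
    name: str
    brand: str


@dataclass
class Person:
    id: int
    name: str
    cars: list = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (id INTEGER PRIMARY KEY, name TEXT, brand TEXT);
    CREATE TABLE person_car (person_id INTEGER, car_id INTEGER);
    -- Made-up rows: Ann has two cars, Bob has one, Cy has none.
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO car VALUES (1, 'Golf', 'VW'), (2, 'Civic', 'Honda');
    INSERT INTO person_car VALUES (1, 1), (1, 2), (2, 2);
""")

# One query for everything: one row per person/car pair, NULL car columns if none.
rows = conn.execute("""
    SELECT p.id, p.name, c.id, c.name, c.brand
    FROM person p
    LEFT JOIN person_car pc ON pc.person_id = p.id
    LEFT JOIN car c ON c.id = pc.car_id
    ORDER BY p.id, c.id
""")

# Group the flat rows by person id while building the objects.
people = {}
for person_id, person_name, car_id, car_name, brand in rows:
    person = people.setdefault(person_id, Person(person_id, person_name))
    if car_id is not None:
        person.cars.append(Car(car_id, car_name, brand))

for person in people.values():
    print(person.name, [car.name for car in person.cars])
```

The trade-off against the JSON approach is that person columns are repeated on every row, but the query stays portable across databases.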

How to Select Max Value From a View PostgreSQL

I am new at SQL and I am trying to select a max value from a view. The database is of movies and actors, and the nested query part works. I am trying to find the actor that has the most co-actors, so the first thing I did was calculate the number of co-actors for each actor. Now I would like to select the value with the highest co-actors and return the number and the name of the actor. Please find below the attempted code:
CREATE VIEW actorview AS
SELECT COUNT(DISTINCT A2.name) AS Count, A.name AS Name
FROM actors A, actors A2
WHERE A2.mid =A.mid
GROUP BY A.name;
SELECT Name, MAX(Count) FROM actorview;
actors table
CREATE TABLE actors (mid integer NOT NULL, name varchar, cast_position integer, PRIMARY KEY (mid, name),
FOREIGN KEY (mid) REFERENCES movies(mid) ON DELETE CASCADE ON UPDATE CASCADE);
Edit:
In the above table, mid (movie ID) identifies a movie an actor has been in; any two actors that share the same mid were in a movie together. The view works for finding the number of co-actors every actor has; now I just need to select from that list the actor that has the most co-actors.
You can use the analytic function RANK as follows:
Select * from
(Select a.name, count(distinct a2.name) as co_actors,
Rank() over (order by count(distinct a2.name) desc ) as rn
From actors a join actors a2
On a.mid = a2.mid and a.name <> a2.name
Group by a.name) t
Where rn = 1;
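If you would rather keep the view, another option is to compare each row against the view's maximum in a scalar subquery. The sketch below is a swapped-in alternative to RANK, not the method above. It runs from Python against an in-memory SQLite database with made-up cast lists, and the view is rewritten to exclude the actor from their own count, which is an assumption about what "co-actors" should mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actors (mid INTEGER NOT NULL, name TEXT, PRIMARY KEY (mid, name));
    -- Made-up cast lists: movie 1 has A, B, C; movie 2 has A, D.
    INSERT INTO actors VALUES (1, 'A'), (1, 'B'), (1, 'C'), (2, 'A'), (2, 'D');

    CREATE VIEW actorview AS
    SELECT a.name AS name, COUNT(DISTINCT a2.name) AS co_actors
    FROM actors a
    JOIN actors a2 ON a2.mid = a.mid AND a2.name <> a.name
    GROUP BY a.name;
""")

# Keep the rows whose count equals the overall maximum; ties are all returned.
top = conn.execute("""
    SELECT name, co_actors
    FROM actorview
    WHERE co_actors = (SELECT MAX(co_actors) FROM actorview)
    ORDER BY name
""").fetchall()

print(top)
```

Like RANK with rn = 1, this returns every actor tied for the highest count rather than an arbitrary single row.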

SQL select the actor and movie name from every actor that has worked with a particular actor

This is a HW assignment, so please no exact answers if you can help it; I want to learn, not have it done for me.
(The create table statements are at the end of this post)
My task is to find all of the actors who have been in a movie with Tom Hanks, ordered by movie title, using 2 queries.
So far I have been able to create the following query; I know that my join is wrong, but I'm not sure why. How can I think about this differently? I feel like I'm close to the answer, but not quite there.
SELECT actor.name, movie.title FROM actor
LEFT OUTER JOIN character.movie_id ON movie.id IN
(
-- Get the ID of every movie Tom Hanks was in
SELECT movie_id FROM actor
INNER JOIN character ON character.actor_id = actor.id
WHERE actor.name = 'Tom Hanks'
)
WHERE actor.name != 'Tom Hanks'
ORDER BY movie.title;
Here are the create table statements for the schema:
create table actor (
id varchar(100),name varchar(100),
constraint pk_actor_id primary key (id));
create table movie(
id varchar(100),
title varchar(100),
year smallint unsigned,
mpaa_rating varchar(10),
audience_score smallint unsigned,
critics_score smallint unsigned,
constraint pk_id primary key(id));
create table character(
actor_id varchar(100),
movie_id varchar(100),
character varchar(100),
constraint pk_character_id primary key(movie_id, actor_id, character),
constraint fk_actor_id foreign key (actor_id) references actor (id),
constraint fk_movie_id foreign key (movie_id) references movie (id));
A left outer join returns every row of the left table (actor). Where a row has a match in the right table, you get the matching values; otherwise the right-hand columns come back as NULL.
Additionally, a join must name a table. In your query, you are trying to join a column (character.movie_id).
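To make the left outer join behaviour concrete without giving the assignment away, here is a small sketch that runs the same join twice, once as LEFT OUTER JOIN and once as INNER JOIN. It uses Python with an in-memory SQLite database; the rows and the trimmed-down schema are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE character (actor_id TEXT, movie_id TEXT, character TEXT);
    -- Made-up rows: the second actor has no row in the character table.
    INSERT INTO actor VALUES ('a1', 'Tom Hanks'), ('a2', 'Unlisted Actor');
    INSERT INTO character VALUES ('a1', 'm1', 'Some Role');
""")

# Same query text for both runs; only the join type differs.
query = """
    SELECT actor.name, character.movie_id
    FROM actor
    {join} character ON character.actor_id = actor.id
    ORDER BY actor.name
"""
left_rows = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

print(left_rows)   # the unmatched actor is kept, with None (NULL) for movie_id
print(inner_rows)  # the unmatched actor is dropped
```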
Something like this, perhaps? FWTH is films with Tom Hanks.
select mo.title, ac.name
from character ch join
(select c.movie_id
from character c join movie m on c.movie_id = m.id
join actor a on a.id = c.actor_id
where a.name = 'Tom Hanks'
) fwth on ch.movie_id = fwth.movie_id
join actor ac on ac.id = ch.actor_id
join movie mo on mo.id = fwth.movie_id
order by mo.title;

SQL cross-reference table self-reference

I am working on a project where I have a table of
all_names(
team_name TEXT,
member_name TEXT,
member_start INT,
member_end INT);
What I have been tasked with is creating a table of
participants(
ID SERIAL PRIMARY KEY,
type TEXT,
name TEXT);
which contains all team and member names as their own entries. Type may be either "team" or "member".
To complement this table of participants I am trying to create a cross-reference table that allows a member to be referenced to a team by ID and vice versa. My table looks like this:
belongs_to(
member_id INT REFERENCES participants(ID),
group_id INT REFERENCES participants(ID),
begin_year INT,
end_year INT,
PRIMARY KEY (member_id, group_id));
I am unsure of how to proceed and populate the table properly.
The select query I have so far is:
SELECT DISTINCT ON (member_name, team_name)
id, member_name, team_name, member_begin_year, member_end_year
FROM all_names
INNER JOIN artists ON all_names.member_name = participants.name;
but I am unsure of how to proceed. What is the proper way to populate the cross reference table?
Probably the easiest solution is to use a few statements. Wrap them in a transaction to make sure you don't get concurrency issues:
BEGIN;
INSERT INTO participants (type, name)
SELECT DISTINCT 'team', team_name
FROM all_names
UNION
SELECT DISTINCT 'member', member_name
FROM all_names;
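INSERT INTO belongs_to (member_id, group_id, begin_year, end_year)
SELECT m.id, g.id, a.member_start, a.member_end
FROM all_names a
-- Match on type as well, so a team and a member with the same name are not confused
JOIN participants m ON m.name = a.member_name AND m.type = 'member'
JOIN participants g ON g.name = a.team_name AND g.type = 'team';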
COMMIT;
Members that are part of multiple teams get all of their memberships recorded.
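The two-step population can be tried end to end with the sketch below, which uses Python and an in-memory SQLite database. The rows are made up, INTEGER PRIMARY KEY stands in for SERIAL because SQLite has no SERIAL type, and each join also filters on type (an assumption) so that a team and a member sharing a name cannot be mixed up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_names (team_name TEXT, member_name TEXT,
                            member_start INT, member_end INT);
    CREATE TABLE participants (id INTEGER PRIMARY KEY, type TEXT, name TEXT);
    CREATE TABLE belongs_to (
        member_id INT REFERENCES participants(id),
        group_id INT REFERENCES participants(id),
        begin_year INT, end_year INT,
        PRIMARY KEY (member_id, group_id));
    -- Made-up rows: Ann belongs to two teams.
    INSERT INTO all_names VALUES
        ('Red', 'Ann', 2000, 2005), ('Blue', 'Ann', 2006, 2010),
        ('Red', 'Bob', 2001, 2003);
""")

with conn:  # one transaction: commits on success, rolls back on error
    # Step 1: every distinct team and member name becomes a participant.
    conn.execute("""
        INSERT INTO participants (type, name)
        SELECT 'team', team_name FROM all_names
        UNION
        SELECT 'member', member_name FROM all_names
    """)
    # Step 2: translate each name pair into a pair of participant ids.
    conn.execute("""
        INSERT INTO belongs_to (member_id, group_id, begin_year, end_year)
        SELECT m.id, g.id, a.member_start, a.member_end
        FROM all_names a
        JOIN participants m ON m.name = a.member_name AND m.type = 'member'
        JOIN participants g ON g.name = a.team_name AND g.type = 'team'
    """)

# Read the memberships back with names resolved, to check the result.
memberships = conn.execute("""
    SELECT m.name, g.name, b.begin_year, b.end_year
    FROM belongs_to b
    JOIN participants m ON m.id = b.member_id
    JOIN participants g ON g.id = b.group_id
    ORDER BY m.name, g.name
""").fetchall()

print(memberships)
```

Note that the primary key on (member_id, group_id) allows only one row per member and team, so a member who left and rejoined the same team would need a wider key.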