Where to store cross referenced SQL values? - sql

I'm building a movie database with two tables, one containing movie names and the other containing actor names. How do I cross reference the data?
For example, should entries in the movie names table contain a list of actors for each movie, or should entries in the actor names table contain a list of movies for each actor?

You will need an additional table that links the two and contains at least the movie id and actor id as its columns. This assumes your actors table has actor_id as its primary key and your movies table has movie_id as its primary key. The linking table can also hold information about the actor that is specific to that movie, for example the character's name or other character-related info.
CREATE TABLE actors_in_movies (
/* Use whatever auto increment keyword is needed for your RDBMS */
id INT NOT NULL PRIMARY KEY <auto increment>,
actor_id INT,
movie_id INT,
character_name VARCHAR(255), /* 255 is an arbitrary length; pick one that fits your data */
other_character_info VARCHAR(255),
character_best_quote TEXT,
FOREIGN KEY (actor_id) REFERENCES actors (actor_id),
FOREIGN KEY (movie_id) REFERENCES movies (movie_id)
);
To query all the movies an actor appears in, use something to the effect of:
SELECT
actors.*,
movies.name
FROM
actors
JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
JOIN movies ON movies.movie_id = actors_in_movies.movie_id
WHERE actors.actor_id = <some actor id>
To query all the actors in a particular movie, use something to the effect of:
SELECT
actors.*
FROM
actors
JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
JOIN movies ON movies.movie_id = actors_in_movies.movie_id
WHERE movies.movie_id = <some movie id>
To add an actor to a movie, insert a row into your actors_in_movies table:
INSERT INTO actors_in_movies (actor_id, movie_id) VALUES (..., ...);
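For anyone who wants to try the linking-table idea end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample names (Actor A, Movie X, and so on) and the trimmed-down column list are assumptions made for illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE actors (actor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The linking table: one row per (actor, movie) appearance.
    CREATE TABLE actors_in_movies (
        id INTEGER PRIMARY KEY,
        actor_id INTEGER NOT NULL REFERENCES actors (actor_id),
        movie_id INTEGER NOT NULL REFERENCES movies (movie_id),
        character_name TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO actors (actor_id, name) VALUES (1, 'Actor A'), (2, 'Actor B');
    INSERT INTO movies (movie_id, name) VALUES (10, 'Movie X'), (20, 'Movie Y');
    INSERT INTO actors_in_movies (actor_id, movie_id, character_name) VALUES
        (1, 10, 'Hero'), (1, 20, 'Villain'), (2, 10, 'Sidekick');
""")

# All movies that actor 1 appears in.
movies_for_actor = [row[0] for row in conn.execute(
    """SELECT movies.name
       FROM actors
       JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
       JOIN movies ON movies.movie_id = actors_in_movies.movie_id
       WHERE actors.actor_id = ?
       ORDER BY movies.name""", (1,))]

# All actors that appear in movie 10.
actors_in_movie = [row[0] for row in conn.execute(
    """SELECT actors.name
       FROM actors
       JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
       JOIN movies ON movies.movie_id = actors_in_movies.movie_id
       WHERE movies.movie_id = ?
       ORDER BY actors.name""", (10,))]

print(movies_for_actor)
print(actors_in_movie)
```

Neither the actors table nor the movies table stores a list; the relationship lives only in actors_in_movies, so adding or removing an appearance is a single-row change.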

Use a cross table! Have a table which has only two columns: movieID, actorID.

You should have a third table, called ActorInMovie, with two columns: ActorID and MovieID.
This way, you'll be able to model a many-to-many relationship.

Related

Sql: Check if a column is used in another table

I have the following scenario: I have two tables, genres and movies. The genres table has id and title columns; the movies table has id, title, release date, price, rating, and genreId as a foreign key.
The FK is reading the Id from the genre table.
My use case is the following. If I create a new genre, Action for example, and afterwards create two new movies with this genre, I shouldn't be able to delete the genre since it's used by two movies.
I need some help with the SQL syntax.
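One common way to get this behaviour is to let the foreign key itself refuse the delete. The sketch below uses Python's sqlite3 module and assumes an ON DELETE RESTRICT clause on genreId; the table layout is reduced to the columns that matter here, the sample rows are made up, and the helper try_delete_genre is a hypothetical name. Other database systems spell the same idea slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite, off by default
conn.executescript("""
    CREATE TABLE genres (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE movies (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        -- RESTRICT makes the database reject deleting a genre that is still referenced.
        genreId INTEGER NOT NULL REFERENCES genres (id) ON DELETE RESTRICT
    );
    INSERT INTO genres (id, title) VALUES (1, 'Action'), (2, 'Drama');
    INSERT INTO movies (title, genreId) VALUES ('Movie One', 1), ('Movie Two', 1);
""")

def try_delete_genre(genre_id):
    """Return True if the genre was deleted, False if a movie still uses it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM genres WHERE id = ?", (genre_id,))
        return True
    except sqlite3.IntegrityError:
        return False

# Explicit check, if you want to know in advance whether a genre is in use.
action_in_use = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM movies WHERE genreId = ?)", (1,)).fetchone()[0]

deleted_action = try_delete_genre(1)  # used by two movies
deleted_drama = try_delete_genre(2)   # used by no movie
remaining = [row[0] for row in conn.execute("SELECT title FROM genres ORDER BY id")]
print(action_in_use, deleted_action, deleted_drama, remaining)
```

Relying on the constraint rather than on a check-then-delete sequence avoids a race between the check and the delete.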

SQLite : how to achieve ORM "eager loading" in plain SQL?

I have a SQLite database with Movies, Actors, and Tags.
There is a many-to-many relation between movies and actors, and movies and tags.
In my app, I want to list all movies with their corresponding actors and tags, for example:
Mr. & Mrs. Smith: Actors: Brad Pitt, Angelina Jolie, Tags: Action, Comedy, Crime
Passengers: Actors: Jennifer Lawrence, Chris Pratt, Tags: Adventure, Drama, Romance
And I'm wondering what are the correct SQL statements to achieve that.
The tables in my database are defined as follows :
CREATE TABLE "Movie"
(
id INTEGER PRIMARY KEY,
name VARCHAR
)
CREATE TABLE "Actor"
(
id INTEGER PRIMARY KEY,
name VARCHAR
)
CREATE TABLE "Tag"
(
id INTEGER PRIMARY KEY,
name VARCHAR
)
CREATE TABLE "Movie_Actor"
(
movie_id INTEGER,
actor_id INTEGER,
FOREIGN KEY(movie_id) REFERENCES "Movie" (id),
FOREIGN KEY(actor_id) REFERENCES "Actor" (id),
UNIQUE(movie_id, actor_id)
);
CREATE TABLE "Movie_Tag"
(
movie_id INTEGER,
tag_id INTEGER,
FOREIGN KEY(movie_id) REFERENCES "Movie" (id),
FOREIGN KEY(tag_id) REFERENCES "Tag" (id),
UNIQUE(movie_id, tag_id)
);
To get a single movie with its actors and tags I use the following 3 queries (for example Movie.id = 1):
To get the movie row:
SELECT *
FROM Movie
WHERE id = 1
To get the actors:
SELECT *
FROM
(SELECT * FROM Actor) AS T1
JOIN
(SELECT * FROM Movie_Actor WHERE Movie_Actor.movie_id = 1) AS T2 ON T1.id = T2.actor_id
To get tags:
SELECT *
FROM
(SELECT * FROM Tag) AS T1
JOIN
(SELECT * FROM Movie_Tag WHERE Movie_Tag.movie_id = 1) AS T2 ON T1.id = T2.tag_id
My question is, how should I go about retrieving the tags and actors when I'm getting a list of movies such as SELECT * FROM Movie?
Many ORMs have an option to 'eager load' relations, and I'm wondering how can I do it in plain SQL?
Do I need to execute two extra queries for each row I get from SELECT * FROM Movie?
Thank You.
To get movie with id = 1 along with all of the actors associated with that movie you do the following:
SELECT * FROM Movie
LEFT JOIN Movie_Actor ON Movie_Actor.movie_id = Movie.id
LEFT JOIN Actor ON Actor.id = Movie_Actor.actor_id
WHERE Movie.id = 1
To also get all the tags, keep joining the associated tables Movie_Tag and Tag.
You might think that this would be terribly inefficient because a lot of information is going to be duplicated, for example the name of a movie is going to be fetched not just once, but NA * NT times, where NA is the number of fetched actors and NT is the number of fetched tags.
Actually, databases tend to be smart about that (precisely because this is a very popular mechanism of retrieving data with as few round-trips to the database as possible), so within their communication protocols they contain special measures to avoid transmitting field values that are identical from row to row. So, the actual amount of data transmitted is very close to the amount of data that would have been transmitted if you queried each table separately.
The benefit, of course, is that you suffer the penalty of a single round-trip to the database, instead of several round-trips, one for each table.
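As a rough illustration of the single-query approach for a whole list of movies, here is a sketch with Python's sqlite3 module. It runs one joined query and folds the repeated rows back into one record per movie in application code. The sample rows are a shortened version of the question's example, and the result layout (a dict per movie) is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (id INTEGER PRIMARY KEY, name VARCHAR);
    CREATE TABLE Actor (id INTEGER PRIMARY KEY, name VARCHAR);
    CREATE TABLE Tag (id INTEGER PRIMARY KEY, name VARCHAR);
    CREATE TABLE Movie_Actor (movie_id INTEGER REFERENCES Movie (id),
                              actor_id INTEGER REFERENCES Actor (id),
                              UNIQUE (movie_id, actor_id));
    CREATE TABLE Movie_Tag (movie_id INTEGER REFERENCES Movie (id),
                            tag_id INTEGER REFERENCES Tag (id),
                            UNIQUE (movie_id, tag_id));
    INSERT INTO Movie VALUES (1, 'Mr. & Mrs. Smith'), (2, 'Passengers');
    INSERT INTO Actor VALUES (1, 'Brad Pitt'), (2, 'Angelina Jolie'),
                             (3, 'Jennifer Lawrence'), (4, 'Chris Pratt');
    INSERT INTO Tag VALUES (1, 'Action'), (2, 'Comedy'), (3, 'Adventure');
    INSERT INTO Movie_Actor VALUES (1, 1), (1, 2), (2, 3), (2, 4);
    INSERT INTO Movie_Tag VALUES (1, 1), (1, 2), (2, 3);
""")

# One round-trip: every movie repeated once per (actor, tag) combination.
rows = conn.execute("""
    SELECT Movie.id, Movie.name, Actor.id, Actor.name, Tag.id, Tag.name
    FROM Movie
    LEFT JOIN Movie_Actor ON Movie_Actor.movie_id = Movie.id
    LEFT JOIN Actor ON Actor.id = Movie_Actor.actor_id
    LEFT JOIN Movie_Tag ON Movie_Tag.movie_id = Movie.id
    LEFT JOIN Tag ON Tag.id = Movie_Tag.tag_id
    ORDER BY Movie.id, Actor.id, Tag.id
""").fetchall()

# Fold the duplicated rows back into one record per movie, keyed by id
# so that two actors or tags sharing a name are not merged by accident.
movies = {}
for movie_id, movie_name, actor_id, actor_name, tag_id, tag_name in rows:
    movie = movies.setdefault(movie_id, {"name": movie_name, "actors": {}, "tags": {}})
    if actor_id is not None:
        movie["actors"][actor_id] = actor_name
    if tag_id is not None:
        movie["tags"][tag_id] = tag_name

for movie in movies.values():
    print(movie["name"], list(movie["actors"].values()), list(movie["tags"].values()))
```

If the actor-by-tag row multiplication becomes a concern, an alternative is one query per relation (three queries in total, regardless of how many movies are listed), which is roughly what many ORMs do for eager loading.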

How do I remove redundancy from this database layout's 1:M relationship and still be able to excerpt the data needed?

The database looks like this:
CREATE TABLE artists (
artist_id SERIAL PRIMARY KEY,
artist TEXT UNIQUE NOT NULL
);
CREATE TABLE artistalias (
artistalias_id SERIAL PRIMARY KEY,
artist_id SERIAL REFERENCES artists (artist_id),
alias TEXT UNIQUE NOT NULL
);
CREATE TABLE songs (
song_id SERIAL PRIMARY KEY,
song TEXT NOT NULL,
artist_id SERIAL REFERENCES artists (artist_id)
);
one artist can have zero, one or many aliases
one alias belongs to exactly one artist
one song has one artist
one artist can have one or many songs
My problem is that the database holds artists who use two or more pseudonyms. For example, one artist uses the stage name Assassin for songs that belong to one genre and the name Agent Sasco for songs that belong to another.
Some artists just randomly change their pseudonyms every now and then.
Later a website should display the data in the following format:
Artist | Song
------------+-------------------
Assassin | Anywhere We Go
Agent Sasco | We Dem A Watch
And when you click on the artist, it will link you to a page showing all the different aliases the artist has performed songs under.
It is important that each song is displayed with the pseudonym it was released under.
The dummy data I work with looks like:
INSERT INTO artists (artist) VALUES
('Assassin'), ('Agent Sasco'), ('Sizzla');
-- This bothers me as it's a lot of redundant data
INSERT INTO artistalias (artist_id, alias) VALUES
(1, 'Agent Sasco'), (2, 'Assassin');
INSERT INTO songs (song, artist_id) VALUES
('Anywhere We Go', 1), ('We Dem A Watch', 2), ('Only Takes Love', 3);
What bothers me with this database layout is that I have to add redundant data to artistalias. There must be a better way to link the table artists to artistalias and songs without having to add one specific artist and his aliases multiple times?
The query to display the data in the desired format looks like:
SELECT
artist AS pseudonyme_song_was_performed_with,
string_agg(alias, ' & ') AS other_pseudonymes,
song
FROM
artists
left JOIN artistalias USING (artist_id)
left JOIN songs USING (artist_id)
GROUP BY artist, song;
Here is a SQLFiddle with the layout and data as described above.
You should store the real name of the artist in the artists table.
Then store all the aliases for that artist in the artistalias table.
Then store the artistalias_id in the songs table.
In that way, you won't have any duplicate data.
CREATE TABLE artists (
artist_id SERIAL PRIMARY KEY,
artist TEXT UNIQUE NOT NULL
);
CREATE TABLE artistalias (
artistalias_id SERIAL PRIMARY KEY,
artist_id INTEGER REFERENCES artists (artist_id), -- a foreign key should not auto-increment, so INTEGER rather than SERIAL
alias TEXT UNIQUE NOT NULL
);
CREATE TABLE songs (
song_id SERIAL PRIMARY KEY,
song TEXT NOT NULL,
artistalias_id INTEGER REFERENCES artistalias (artistalias_id) -- INTEGER rather than SERIAL for the same reason
);
Then insert the data this way:
INSERT INTO artists (artist) VALUES
('Jeffrey Campbell'), ('Sizzla');
-- Each alias is stored once and points at the real artist
INSERT INTO artistalias (artist_id, alias) VALUES
(1, 'Agent Sasco'), (1, 'Assassin');
INSERT INTO songs (song, artistalias_id) VALUES
('Anywhere We Go', 2), ('We Dem A Watch', 1); -- alias 2 is 'Assassin', alias 1 is 'Agent Sasco'
And query this way:
SELECT
a1.alias AS pseudonyme_song_was_performed_with,
string_agg(a2.alias, ' & ') AS other_pseudonymes,
song
FROM
artistalias a1
left JOIN artistalias a2 on a2.artist_id = a1.artist_id and a2.artistalias_id <> a1.artistalias_id
left JOIN songs s on s.artistalias_id = a1.artistalias_id
GROUP BY a1.alias, song;
Fiddle: http://sqlfiddle.com/#!15/3a78c/8/0
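The same layout can be tried locally with Python's sqlite3 module. This is only a sketch under a few assumptions: SQLite's group_concat stands in for PostgreSQL's string_agg, the join skips the alias the song was released under so it is not listed among the "other" pseudonyms, and Sizzla gets an alias row of his own (an addition not present in the answer's data) so that his song can reference an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, artist TEXT UNIQUE NOT NULL);
    CREATE TABLE artistalias (
        artistalias_id INTEGER PRIMARY KEY,
        artist_id INTEGER REFERENCES artists (artist_id),
        alias TEXT UNIQUE NOT NULL
    );
    CREATE TABLE songs (
        song_id INTEGER PRIMARY KEY,
        song TEXT NOT NULL,
        artistalias_id INTEGER REFERENCES artistalias (artistalias_id)
    );
    INSERT INTO artists (artist_id, artist) VALUES (1, 'Jeffrey Campbell'), (2, 'Sizzla');
    INSERT INTO artistalias (artistalias_id, artist_id, alias) VALUES
        (1, 1, 'Agent Sasco'), (2, 1, 'Assassin'), (3, 2, 'Sizzla');
    INSERT INTO songs (song, artistalias_id) VALUES
        ('Anywhere We Go', 2), ('We Dem A Watch', 1), ('Only Takes Love', 3);
""")

rows = conn.execute("""
    SELECT a1.alias, group_concat(a2.alias, ' & '), s.song
    FROM songs s
    JOIN artistalias a1 ON a1.artistalias_id = s.artistalias_id
    -- other aliases of the same artist, excluding the one used for this song
    LEFT JOIN artistalias a2 ON a2.artist_id = a1.artist_id
                            AND a2.artistalias_id <> a1.artistalias_id
    GROUP BY s.song_id, a1.alias, s.song
    ORDER BY s.song_id
""").fetchall()

for released_as, other_aliases, song in rows:
    print(released_as, "|", other_aliases, "|", song)
```

An artist with only one alias yields NULL (None in Python) in the second column, which the page can render as an empty cell.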

Matching delimited string to table rows

So I have two tables in this simplified example: People and Houses. People can own multiple houses, so I have a People.Houses field which is a string with comma delimiters (eg: "House1, House2, House4"). Houses can have multiple people in them, so I have a Houses.People field, which works the same way ("Sam, Samantha, Daren").
I want to find all the rows in the People table corresponding to the names of people in a given house, and vice versa for houses belonging to people. But I can't figure out how to do that.
This is as close as I've come up with so far:
SELECT People.*
FROM Houses
LEFT JOIN People ON Houses.People Like CONCAT(CONCAT('%', People.Name), '%')
WHERE Houses.Name = 'SomeArbitraryHouseImInterestedIn'
But I get some false positives (eg: Sam and Samantha might both get grabbed when I just want Samantha. And likewise with House3, House34, and House343, when I want House343).
I thought I might try and write a SplitString function so I could split a string (using a list of delimiters) into a set, and do some subquery on that set, but MySQL functions can't have tables as return values.
Likewise you can't store arrays as fields, and from what I gather the comma-delimited elements in a long string seems to be the usual way to approach this problem.
I can think of some different ways to get what I want but I'm wondering if there isn't a nice solution.
"Likewise you can't store arrays as fields, and from what I gather the comma-delimited elements in a long string seems to be the usual way to approach this problem."
I hope that's not true. "Arrays" in SQL databases shouldn't be represented in a comma-delimited format; the problem is correctly solved by using a junction table. Comma-separated fields have no place in relational databases, and they violate first normal form.
You'd want your table schema to look something like this:
CREATE TABLE people (
id int NOT NULL,
name varchar(50),
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE houses (
id int NOT NULL,
name varchar(50),
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE people_houses (
house_id int,
person_id int,
PRIMARY KEY (house_id, person_id),
FOREIGN KEY (house_id) REFERENCES houses (id),
FOREIGN KEY (person_id) REFERENCES people (id)
) ENGINE=INNODB;
Then searching for people will be as easy as this:
SELECT p.*
FROM houses h
JOIN people_houses ph ON ph.house_id = h.id
JOIN people p ON p.id = ph.person_id
WHERE h.name = 'SomeArbitraryHouseImInterestedIn';
No more false positives, and they all lived happily ever after.
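To see the junction table avoid the Sam/Samantha and House3/House343 mix-ups, here is a small sketch using Python's sqlite3 module instead of MySQL. The sample rows and the helper names people_in_house and houses_of_person are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE houses (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE people_houses (
        house_id INTEGER REFERENCES houses (id),
        person_id INTEGER REFERENCES people (id),
        PRIMARY KEY (house_id, person_id)
    );
    INSERT INTO people VALUES (1, 'Sam'), (2, 'Samantha'), (3, 'Daren');
    INSERT INTO houses VALUES (1, 'House3'), (2, 'House34'), (3, 'House343');
    -- Samantha and Daren are linked to House343; Sam is linked to House3.
    INSERT INTO people_houses (house_id, person_id) VALUES (3, 2), (3, 3), (1, 1);
""")

def people_in_house(house_name):
    """Names of the people linked to the given house."""
    return [row[0] for row in conn.execute(
        """SELECT p.name
           FROM houses h
           JOIN people_houses ph ON ph.house_id = h.id
           JOIN people p ON p.id = ph.person_id
           WHERE h.name = ?
           ORDER BY p.name""", (house_name,))]

def houses_of_person(person_name):
    """Names of the houses linked to the given person."""
    return [row[0] for row in conn.execute(
        """SELECT h.name
           FROM people p
           JOIN people_houses ph ON ph.person_id = p.id
           JOIN houses h ON h.id = ph.house_id
           WHERE p.name = ?
           ORDER BY h.name""", (person_name,))]

print(people_in_house("House343"))
print(people_in_house("House3"))
print(houses_of_person("Samantha"))
```

Because the joins compare integer ids and the filter compares whole names, 'Sam' can never match 'Samantha' and 'House3' can never match 'House343'.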
The nice solution is to redesign your schema so that you have the following tables:
People
------
PeopleID (PK)
...
PeopleHouses
------------
PeopleID (PK) (FK to People)
HouseID (PK) (FK to Houses)
Houses
------
HouseID (PK)
...
Short Term Solution
For your immediate problem, the FIND_IN_SET function is what you want to use for joining:
For People
SELECT p.*
FROM PEOPLE p
JOIN HOUSES h ON FIND_IN_SET(p.name, h.people)
WHERE h.name = ?
For Houses
SELECT h.*
FROM HOUSES h
JOIN PEOPLE p ON FIND_IN_SET(h.name, p.houses)
WHERE p.name = ?
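The difference between the LIKE '%name%' attempt and whole-element matching can be sketched in a few lines of Python. This is not MySQL's implementation, just an illustration of the idea; the function names are hypothetical. One caveat worth checking against your data: as far as I know, FIND_IN_SET compares each comma-separated element exactly and does not trim spaces, so a value stored as "Sam, Samantha, Daren" (with a space after each comma) would need the spaces removed first. The sketch below trims them explicitly.

```python
def contains_substring(csv_field, name):
    """Mimics LIKE '%name%': true whenever name occurs anywhere in the field."""
    return name in csv_field

def contains_element(csv_field, name):
    """Whole-element matching: split on commas, trim spaces, compare exactly."""
    elements = [element.strip() for element in csv_field.split(",")]
    return name in elements

house_people = "Samantha, Daren"  # hypothetical Houses.People value

print(contains_substring(house_people, "Sam"))     # substring match: false positive
print(contains_element(house_people, "Sam"))       # element match: no false positive
print(contains_element(house_people, "Samantha"))  # genuine match
```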
Long Term Solution
Is to properly model this by adding a table to link houses to people, because you're likely storing redundant relationships in both tables:
CREATE TABLE people_houses (
house_id int,
person_id int,
PRIMARY KEY (house_id, person_id),
FOREIGN KEY (house_id) REFERENCES houses (id),
FOREIGN KEY (person_id) REFERENCES people (id)
)
The problem is that you have to use another schema, like the one proposed by @RedFilter. You can see it as:
People table:
PeopleID
otherFields
Houses table:
HouseID
otherFields
Ownership table:
PeopleID
HouseID
otherFields
Hope that helps.
Just swap the table positions so that People is on the left side and Houses is on the right side:
SELECT People.*
FROM People
LEFT JOIN Houses ON Houses.People Like CONCAT(CONCAT('%', People.Name), '%')
WHERE Houses.Name = 'SomeArbitraryHouseImInterestedIn'

how to insert data using multiple tables

I have created a database named movielibrarysystem with three tables: type, publisher and movie.
One publisher can have many movies, and one movie can be of many types. In the movie table, the publisher id and the type id act as foreign keys.
My question is how to insert data into the movie table. I have already inserted data into the publisher and type tables, but I am not able to insert into the movie table.
Here's what you do. First, the relationship between publisher and movie is one to many - a publisher can publish many movies but each movie only has one publisher. However, type to movie is a many-to-many relationship (you state that one movie is of many types, but it's also the case that one type is of many movies), so you should have an extra table for that relationship.
Essentially:
publisher:
publisher_id
publisher_name
<other publisher info>
type:
type_id
type_name
<other type info>
movie:
movie_id
movie_name
publisher_id references publisher(publisher_id)
<other movie info>
movie_type:
movie_id references movie(movie_id)
type_id references type(type_id)
with suitable primary keys for all those.
Then assuming you have the publisher and type inserted, you can insert the movie thus:
begin transaction;
insert into movie (movie_name,publisher_id) values (
'Avatar',
(select publisher_id from publisher where publisher_name = 'Spielberg')
);
insert into movie_type (movie_id,type_id) values (
(select movie_id from movie where movie_name = 'Avatar'),
(select type_id from type where type_name = 'SciFi')
);
insert into movie_type (movie_id,type_id) values (
(select movie_id from movie where movie_name = 'Avatar'),
(select type_id from type where type_name = 'GrownUpSmurfs')
);
commit;
In other words, you use sub-selects to get the IDs from the relevant tables based on a unique set of properties. The example above assumes movie names are unique; in reality you will need a more specific query to handle different films with the same name (The Omega Man, for example).
If you're not using a DBMS that supports selects in value sections, your best bet will be probably just to remember or extract the relevant values to a variable in whatever programming language you're using and construct a query to do it. In pseudo-code:
begin transaction;
select publisher_id into :p_id from publisher where publisher_name = 'Spielberg';
insert into movie (movie_name,publisher_id) values ('Avatar', :p_id);
select movie_id into :m_id from movie where movie_name = 'Avatar';
select type_id into :t_id1 from type where type_name = 'SciFi';
select type_id into :t_id2 from type where type_name = 'GrownUpSmurfs';
insert into movie_type (movie_id,type_id) values (:m_id, :t_id1);
insert into movie_type (movie_id,type_id) values (:m_id, :t_id2);
commit;
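In a host language, the "extract the values to a variable" approach might look like the sketch below, written with Python's sqlite3 module. The table definitions follow the outline above but the exact column types are assumptions, and add_movie is a hypothetical helper name. It uses the cursor's lastrowid to get the new movie's id instead of looking the movie up again by name, which sidesteps the duplicate-title problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE publisher (publisher_id INTEGER PRIMARY KEY,
                            publisher_name TEXT UNIQUE NOT NULL);
    CREATE TABLE type (type_id INTEGER PRIMARY KEY, type_name TEXT UNIQUE NOT NULL);
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY,
                        movie_name TEXT NOT NULL,
                        publisher_id INTEGER NOT NULL REFERENCES publisher (publisher_id));
    CREATE TABLE movie_type (movie_id INTEGER NOT NULL REFERENCES movie (movie_id),
                             type_id INTEGER NOT NULL REFERENCES type (type_id),
                             PRIMARY KEY (movie_id, type_id));
    INSERT INTO publisher (publisher_name) VALUES ('Spielberg');
    INSERT INTO type (type_name) VALUES ('SciFi'), ('GrownUpSmurfs');
""")

def add_movie(movie_name, publisher_name, type_names):
    """Insert a movie and its type links in one transaction; return the new movie_id."""
    with conn:  # commits on success, rolls back if any statement raises
        # Unpacking raises TypeError if the publisher or a type does not exist,
        # which aborts the whole transaction.
        (publisher_id,) = conn.execute(
            "SELECT publisher_id FROM publisher WHERE publisher_name = ?",
            (publisher_name,)).fetchone()
        cursor = conn.execute(
            "INSERT INTO movie (movie_name, publisher_id) VALUES (?, ?)",
            (movie_name, publisher_id))
        movie_id = cursor.lastrowid  # id generated for the row just inserted
        for type_name in type_names:
            (type_id,) = conn.execute(
                "SELECT type_id FROM type WHERE type_name = ?", (type_name,)).fetchone()
            conn.execute(
                "INSERT INTO movie_type (movie_id, type_id) VALUES (?, ?)",
                (movie_id, type_id))
    return movie_id

avatar_id = add_movie("Avatar", "Spielberg", ["SciFi", "GrownUpSmurfs"])
avatar_types = [row[0] for row in conn.execute(
    """SELECT type.type_name
       FROM movie_type
       JOIN type ON type.type_id = movie_type.type_id
       WHERE movie_type.movie_id = ?
       ORDER BY type.type_name""", (avatar_id,))]
print(avatar_id, avatar_types)
```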
In response to your comment:
"hey, i didn't got the point that of type and movie relationship.. will you please elaborate this point .. regards Abid"
Both Avatar and Solaris can be considered of the type SciFi. So many movies to one genre. And Xmen:Wolverine can be considered both action and comic-remake. So many types to one movie.
Many-to-many relationships are best represented with a separate table containing the cross matches between the two related tables.
If you constrain tables type and publisher to use foreign keys you will need to first insert both ids into your movies table. If possible I would let the movies table be the one that increments the foreign keys.