I have a large number of movies and TV series, which I currently keep track of in an MS Excel worksheet. Due to the large number of records and various data required, it is no longer a convenient option, so I want to switch to a MYSQL database, accessed through a GUI programmed in Java using Netbeans IDE.
I have the following tables in Excel:
Media_Library,
To_Be_Watched,
Statistics,
Wish_List,
Orders
Each film and TV series in my collection is in the Media_Library table, which has the following fields:
Sorting_Title
Title
Collection
Genre
Release_Year
Director
Age_Rating
Country
Runtime (min)
Watched
Media_Type
Format
For example: 'Alien 2', 'Aliens', 'Alien: Anthology', 'Action/Horror/Sci-Fi', 1986, 'James Cameron', 'M', 'America', 137, 'Yes', 'Movie', '4K UHD'
I'm stuck on what to do for the following fields: Genre, Director, Country, Runtime
Those 4 fields can each have multiple values, and I don't know how best to handle that; e.g. most films only have 1 runtime, but many have multiple (2 of the films have 4 different cuts). Also anthology films can have something like 6 different directors. I want to include all relevant genres, directors, countries and runtimes, but I don't know how to best do that.
I've tried adding a column for each value; genre1, genre2, ... This results in many blank values though. In the spreadsheet in Excel I put all applicable genres in a single field as one string, e.g. 'comedy/horror'.
What would be the easiest way to resolve this issue? Can I do a many-to-many relationship to achieve what I want?
Simply put a hard limit on the amount of genres.
For instance, while you may want the user to be able to enter as many genres as they want, is it rational to go above 20 genres? That doesn't make much sense and will only make searches much more time intensive.
For other possible duplicates, you can do something like this (in sqlite3 at least):
CREATE TABLE IF NOT EXISTS Directors
(id INTEGER PRIMARY KEY,
director TEXT,
UNIQUE(director) ON CONFLICT IGNORE)
CREATE TABLE IF NOT EXISTS file
(file_id INTEGER PRIMARY KEY,
filename TEXT,
director_id INTEGER,
watched INTEGER,
FOREIGN KEY (director_id)
REFERENCES Directors (id)
ON UPDATE CASCADE
ON DELETE SET NULL)
It doesn't matter if more than one genre have the same director, just as long as the 'file' table knows which one it's referencing and staying updated.
The 'watched' column holds a type of value that doesn't make sense to create an individual table for. For instance, say a song's track number is 2. Creating a table just for track numbers to reference doesn't make sense because you're going to spend a point in that table, then spend another point in the 'file' table to reference. So, just spend 1 point and put in the 'file'.
https://www.sqlitetutorial.net/sqlite-foreign-key/
Generally, you would add a second table, Directors, for instance, and then you relate that back to the movie title. You will need a uniqueID for the movie, and you do a join where that uniqueID is referenced in the Directors table, something like this working demo (not all fields were included in my demo):
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9e8d73bc798767b56b974ab4ebd30517
SELECT m.id, m.title, d.director
FROM Media_Library m
JOIN Directors d ON (m.id = d.mediaID);
or to concatenate the directors:
SELECT m.id, m.title, group_concat(d.director) as directors
FROM Media_Library m
JOIN Directors d ON (m.id = d.mediaID)
GROUP BY m.id;
Usually when you make this kind of relationship you will define a foreign key restraint, creating a link between the primary key of one table and a key (or keys) in another. In this case, the link is between id in Media_Library and mediaID in directors, so you would alter the create statement like this:
CREATE TABLE Directors (
id int not null auto_increment,
mediaID int,
director varchar(50),
PRIMARY KEY(id),
FOREIGN KEY (mediaID) REFERENCES Media_Library(id)
);
The foreign key is not strictly necessary, but it can reinforce database integrity. The ins and outs of foreign keys are out of scope for this answer, but you should probably read about them.
It is also possible to store the data in a JSON field since v5.7, like this:
CREATE TABLE test.Media_Library (id int not null auto_increment, title varchar (50), director JSON, PRIMARY KEY (id));
INSERT INTO test.Media_Library (title, director) VALUES
('Alien', json_array("Scott", "Scorsese")),
('The Alienist', '["Tarantino", "Nolan", "Kubrick"]'),
('Alien 2', '["Scott"]');
SELECT * FROM test.Media_Library;
https://www.db-fiddle.com/f/tG1SZorEHEYi5cYwgXjPeY/1
In the second query in that fiddle, I select only the first director from the list:
SELECT id, title, director->>'$[0]' as firstDirector
FROM Media_Library;
There are advantages to storing data this way, but there are tradeoffs, and unless you know what you are doing or you have a specific reason to be using JSON fields (for instance, you are getting the data from an api as JSON and you just want to use it as is), I would stick with the join method. Also, storing arrays is inherently non-normal (read about database normalization, a quick overview on wiki: https://en.wikipedia.org/wiki/Database_normalization).
Let me try to make it simple with an example.
I am creating a database with 2 tables, School and Students
School table with 2 columns SchoolID(PK) and Location
Student table with 3 columns StudentID(PK), SchoolID(FK) and Grade
The Student table will have students from all the schools which can be identified by the foreign key SchoolID. The StudentID column will be unique to each student across all schools. Well and good so far.
I need another column in Student table which will be unique only with respect to the foreign key value. Let's call this ID. I need this column to be unique only to each SchoolID. So, If I filter out the students belonging to a particular SchoolID, I shouldn't get 2 students with same ID.
I'm not sure if that's a valid scenario.
Edit:
This is good
This is not
I think there is something wrong with the datamodel. Since you have StudentID as primary key in Student-table, this column will always be unique in this table. But it seems like you are trying to create a Student-School table where you can have the same student connected to multiple schools (but no the same school multiple times for one single student). I think you need at least 3 tables:
Student (Pk StudentID)
School (Pk SchoolId)
StudentSchool
The StudentSchool table will have two FK-columns: StudentID and SchoolID. To protect from the same student beeing mapped to the same school multiple times you could either have the PK include StudentId and SchoolId or create a Unique constraint.
Lets say I have a table of Classrooms.
Classroom Table
Each Classroom has its own set of Students specific to that Classroom. What would be the best way to set this up? Should I…
A. Make a separate Student table for each Classroom? How would I assign a Classroom to a table though?
B. Make one big list of Students each with their own Classroom FK? What if there are millions of Students and you are only looking for Students of a specific Classroom?
I am new to SQL btw
I would suggest one table for each entity and a join table to describe the relationship between the two. So one table for all the students, another table for all the classrooms and another table for joining.
Minimal example
STUDENT table has columns id (integer, primary key) and name (varchar).
CLASSROOM table has columns id (integer, primary key) and description (varchar).
STUDENT_CLASSROOM has integer columns id (primary key), student_id and classroom_id.
This way students can be assigned to classrooms (or classrooms can be assigned to students) and you can declare your foreign keys as appropriate.
I have a SQLite database with the following tables:
There is a Persons table (ID, name,...) and a Pets table (ID, name, owner,...), where owner is a foreign key that refers to a row of Persons.
A person can have as many pets as they want, but each pet must have an individual name.
The problem is that the name must be unique only for the same owner, a name can occur more than once in the Pets table, but only once per ID of the person stored in the owners column.
Therefore, the name column cannot be unique, but owner and name together should be.
How can this be achieved in SQLite?
You simply want a unique constraint or index:
create unique index unq_pets_owner_name on pets(owner, name);
I'm building a movie database with two tables, one containing movie names and the other containing actor names. How do I cross reference the data?
For example, should entries in the movie names table contain a list of actors for each movie, or should entries in the actor names table contain a list of movies for actor?
You will need an additional table which links the two, and contains at least the movie id and actor id as its columns. Assuming your Actors table has actor_id as its primary key, and Movies has movie_id as its primary key. This table could also include information about the actor specific to that movie, for example, the character's name or other character-related info.
CREATE TABLE actors_in_movies (
/* Use whatever auto increment keyword is needed for your RDBMS */
id INT NOT NULL PRIMARY KEY <auto increment>,
actor_id INT,
movie_id INT,
character_name VARCHAR(),
other_character_info VARCHAR(),
character_best_quote TEXT,
FOREIGN KEY (actor_id) REFERENCES actors (actor_id),
FOREIGN KEY (movie_id) REFERENCES actors (movies_id)
);
To query all the movies an actor appears in,
use something to the effect of:
SELECT
actors.*,
movies.name
FROM
actors
JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
JOIN movies ON movies.movie_id = actors_in_movies.movie_id
WHERE actors.actor_id = <some actor id>
To query all the actors in a particular movie,
use something to the effect of:
SELECT
actors.*
FROM
actors
JOIN actors_in_movies ON actors.actor_id = actors_in_movies.actor_id
JOIN movies ON movies.movie_id = actors_in_movies.movie_id
WHERE movies.movie_id = <some movie id>
To add an actor to a movie, insert a row into your actors_in_movies table:
INSERT INTO actors_in_movies (actor_id, movie_id) VALUES (..., ...);
Use a cross table! Have a table which has only two columns: movieID, actorID.
You should have a third table, one called ActorInMovie. This table should have 2 columns. ActorID, MovieID.
This way, you'll be able to have a many to many rel.