SQL Database layout/design - sql

I'm creating a music player that is backed by a SQLite database. There's a songs table that has an id, title, artist, album, etc. I'm currently trying to make playlists and I'd like to know whether my design will be efficient. Initially I wanted to make a table of playlists and each playlist entry would have a list of song ids. I would then query the songs table for the list of song ids. Something along the lines of SELECT * FROM songs where id=this OR id=that OR id=..... However, I've just read up on joins so now I'm thinking that each playlist should be its own table and entries for a playlist table would just be ids from the songs table and I can do an inner join on the song id column between a specific playlist table and the songs table. Which method would be more efficient? Are they equivalent?

When you find yourself considering creating multiple identical tables holding similar data you probably should take a step back and rethink your design as this is contrary to the idea underlying the relational model. The same goes for the idea of storing "lists of ids", which I interpret to mean some kind of array of data, something that also is a bad fit for a good model as every item in a row (or tuple) should only store one value.
One possible design for your domain could be this:
Songs (SongID PK, SongAttributes (name, length etc) ...)
Playlists (PlaylistID PK, PlaylistAttributes (like name, owner etc) ...)
PlaylistSongs (PlaylistID FK, SongID FK)
PK = Primary Key, FK = Foreign Key
To select all songs for a certain playlist:
select songs.name
from songs
join playlistsongs on songs.songid = playlistsongs.songid
join playlists on playlists.playlistid = playlistsongs.playlistid
where playlists.name = '???'
As for your questions: Which method would be more efficient? Are they equivalent?
Neither would be good, and they're not equivalent. With the first approach you would probably run into problems retrieving and updating data. With the second, the number of tables would grow linearly with the number of playlists, and you would have to inject the right table name into every query at run time, which is really something you do not want.
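For what it's worth, here is a minimal sketch of the three-table design in SQLite, driven from Python's built-in sqlite3 module. The snake_case table names, the sample rows and the position column (used to keep track order) are my own assumptions, not part of the question.

```python
import sqlite3

# Hypothetical names following the Songs / Playlists / PlaylistSongs design.
SCHEMA = """
CREATE TABLE songs (
    song_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE playlists (
    playlist_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE playlist_songs (
    playlist_id INTEGER NOT NULL REFERENCES playlists(playlist_id),
    song_id     INTEGER NOT NULL REFERENCES songs(song_id),
    position    INTEGER NOT NULL,   -- assumed column: track order in the playlist
    PRIMARY KEY (playlist_id, position)
);
"""

def songs_in_playlist(conn, playlist_name):
    """Return the song titles of one playlist, in playlist order."""
    rows = conn.execute(
        """
        SELECT s.title
        FROM playlists p
        JOIN playlist_songs ps ON ps.playlist_id = p.playlist_id
        JOIN songs s           ON s.song_id = ps.song_id
        WHERE p.name = ?
        ORDER BY ps.position
        """,
        (playlist_name,),
    )
    return [title for (title,) in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO songs VALUES (?, ?)",
                 [(1, "Song A"), (2, "Song B"), (3, "Song C")])
conn.executemany("INSERT INTO playlists VALUES (?, ?)",
                 [(1, "Morning"), (2, "Workout")])
conn.executemany("INSERT INTO playlist_songs VALUES (?, ?, ?)",
                 [(1, 3, 1), (1, 1, 2), (2, 2, 1)])

print(songs_in_playlist(conn, "Morning"))
```

Adding a playlist is then just an INSERT into playlists, with no new table and no table name spliced into the query text.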

Long chains of OR conditions become inefficient very quickly, so a model with a playlists table, where each row pairs a playlist id with one song id, is an option.
Example:
SONG ----> PLAYLIST WITH LIST OF SONGIDs
Pseudo code:
CREATE TABLE song (
song_id INT PRIMARY KEY,
name VARCHAR
-- other attributes
);
CREATE TABLE playlists (
playlist_id INT,
song_id INT REFERENCES song(song_id)
-- other playlist attributes
);
Then you can join the list:
SELECT a.*, b.*
FROM playlists a
INNER JOIN song b ON a.song_id=b.song_id AND a.playlist_id=?;
which will give you the songs in the playlist with the given playlist_id.

Related

Extending table with another table ... sort of

I have a DB about renting cars.
I created a CarModels table (ModelID as PK).
I want to create a second table with the same primary key as CarModels has.
This table only contains the number of times this model was searched for on my website.
So let's say you visit my website: you can check a list of the most commonly rented cars.
A "Most popular Cars" table.
It's not about a one-to-one relationship, that's for sure.
Is there any SQL code to connect two primary keys together?
select m.ModelID, m.Field1, m.Field2,
t.TimesSearched
from CarModels m
left outer join Table2 t on m.ModelID = t.ModelID
But why not simply add the field TimesSearched to the CarModels table?
Then you don't need another table.
Easiest is to just use a new primary key on the new table with a foreign key to the CarModels table, like [CarModelID] INT NOT NULL. You can put an index and a unique constraint on the FK.
If you reeeealy want them to be the same, you can jump through a bunch of hoops that will make your life Hell, like creating the table from the CarModels table, then setting that field as the primary key, then whenever you add a new CarModel you'll have to create a trigger that will SET IDENTITY_INSERT ON so you can add the new one, and remember to SET IDENTITY_INSERT OFF when you're done.
Personally, I'd create a CarsSearched table that holds ThisUser selected ThisCarModel on ThisDate: then you can start doing some fun data analysis like [are some cars more popular in certain zip codes or certain times of year?], or [this company rents three cars every year in March, so I'll send them a coupon in January].
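A rough sketch of that search-log idea, using SQLite through Python's sqlite3 module only so that it runs anywhere. The table and column names (car_models, car_searches, searched_on) and the sample rows are made up for illustration. The "most popular" list is derived with a GROUP BY instead of being stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car_models (
    model_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- One row per search event; names are illustrative, not from the question.
CREATE TABLE car_searches (
    search_id   INTEGER PRIMARY KEY,
    model_id    INTEGER NOT NULL REFERENCES car_models(model_id),
    user_id     INTEGER,
    searched_on TEXT NOT NULL       -- ISO date string
);
""")
conn.executemany("INSERT INTO car_models VALUES (?, ?)",
                 [(1, "Compact"), (2, "Sedan"), (3, "Van")])
conn.executemany(
    "INSERT INTO car_searches (model_id, user_id, searched_on) VALUES (?, ?, ?)",
    [(2, 10, "2024-03-01"), (2, 11, "2024-03-02"), (1, 10, "2024-03-02")],
)

def most_popular(conn, limit=10):
    """Return (model name, times searched) pairs, most searched first."""
    return conn.execute(
        """
        SELECT m.name, COUNT(s.search_id) AS times_searched
        FROM car_models m
        LEFT JOIN car_searches s ON s.model_id = m.model_id
        GROUP BY m.model_id, m.name
        ORDER BY times_searched DESC, m.name
        LIMIT ?
        """,
        (limit,),
    ).fetchall()

print(most_popular(conn))
```

The LEFT JOIN keeps models that were never searched (they show up with a count of 0), and the date and user columns are what make the later analysis possible.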
You are not extending anything (that would mean modifying the actual model of the table). You simply need an INNER JOIN of the two tables on their primary keys being equal.
It could be an outer join, as has been suggested, but if it's 1:1 like you said (the second table will have exactly the same keys, I assume all of them), an inner join will be enough, as both tables would have the same set of primary keys.
As a bonus, it will also produce fewer rows if some keys are missing from the second table, which is a nice reminder that you failed to match all PKs.
That being said, do you have a strong reason not to keep the number in the same table? You are basically modeling a 1:1 relationship for one extra column (and a small one too, by data type).
You could extend the table (now this really is extending the table's model) with an additional integer attribute that keeps that number for you.
The latter is preferred for simplicity and lower query times.

SQL database design pattern for user favorites?

Asked this on the database site but it seems to be really slow moving. So I'm new to SQL and databases in general; the only thing I have worked on with an SQL database used one-to-many relationships. I want to know the easiest way to go about implementing a "favorites" mechanism for users in my DB, similar to what loads of sites like Youtube, etc., offer. Users are of course unique, so one user can have many favorites, but one item can also be favorited by many users. Is this considered a many-to-many relationship? What is the typical design pattern for doing this? Many-to-many relationships look like a headache (I'm using SQLAlchemy, so my tables are interacted with like objects), but this seems to be a fairly common feature on sites, so I was wondering what is the most straightforward and easy way to go about it. Thanks
Yes, this is a classic many-to-many relationship. Usually, the way to deal with it is to create a link table, so in say, T-SQL you'd have...
-- user is a reserved word in T-SQL, so the table name is bracketed
create table [user]
(
user_id int identity primary key,
-- other user columns
)
create table item
(
item_id int identity primary key,
-- other item columns
)
create table userfavoriteitem
(
user_id int foreign key references [user](user_id),
item_id int foreign key references item(item_id),
-- other information about favoriting you want to capture
)
To see who favorited what, all you need to do is run a query on the userfavoriteitem table which would now be a data mine of all sorts of useful stats about what items are popular and who liked them.
select ufi.item_id
from userfavoriteitem ufi
where ufi.user_id = @user_id -- the id of the user you are interested in
Or you can even get the most popular items on your site using the query below, though if you have a lot of users this will get slow, and the results should be saved in a special table that is updated by a scheduled job on the backend every so often...
select top 10 ufi.item_id, count(ufi.item_id) as favorite_count
from userfavoriteitem ufi
group by ufi.item_id
order by favorite_count desc
I've never seen any explicitly-for-database design patterns (except a couple of trivial misuses of the phrase 'design pattern' when it became fashionable some years ago).
M:M relationships are OK: use a link table (aka association table etc etc). Your example of a User and Favourite sounds like M:M indeed.
create table LinkTable
(
Id int IDENTITY(1, 1), -- PK of this table
IdOfTable1 int, -- PK of table 1
IdOfTable2 int -- PK of table 2
)
...and create a UNIQUE index on (IdOfTable1, IdOfTable2). Or do away with the Id column and make the PK on (IdOfTable1, IdOfTable2) instead.
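To make the composite-key variant concrete, here is a small sketch in SQLite via Python's sqlite3 module (plain SQL rather than SQLAlchemy, to keep it self-contained). The table names, the helper functions add_favorite and favorites_of, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE items (item_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE user_favorite_items (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    item_id INTEGER NOT NULL REFERENCES items(item_id),
    PRIMARY KEY (user_id, item_id)   -- a user can favorite an item only once
);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "video"), (2, "song")])

def add_favorite(conn, user_id, item_id):
    """Record a favorite; return False if the pair was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO user_favorite_items (user_id, item_id) VALUES (?, ?)",
        (user_id, item_id),
    )
    return cur.rowcount == 1

def favorites_of(conn, user_id):
    """Return the item ids a user has favorited, in ascending order."""
    rows = conn.execute(
        "SELECT item_id FROM user_favorite_items WHERE user_id = ? ORDER BY item_id",
        (user_id,),
    )
    return [item_id for (item_id,) in rows]

add_favorite(conn, 1, 1)
add_favorite(conn, 1, 2)
add_favorite(conn, 2, 2)
print(add_favorite(conn, 1, 2))   # same pair again, rejected by the primary key
print(favorites_of(conn, 1))
```

With the primary key on the pair, the database itself guarantees there are no duplicate favorites, so the application does not have to check first.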

Inserting data into a table with new and old data from another two tables

I have a table name Queue_info with structure as
Queue_Id number(10)
Movie_Id number(10)
User_Id Varchar2(20)
Status Varchar2(20)
Reserved_date date
I have two other tables: Movie_info, which has many columns including Movie_Id, and User_info, which has many columns including User_Id.
In the first table, Movie_Id and User_Id are foreign keys referencing Movie_info(Movie_Id) and User_info(User_Id).
My problem is that whenever I insert a row into either Movie_info or User_info, the Queue_info table should get a new entry for every user or for every movie.
For example:
If a new movie is inserted into Movie_info, then Queue_info should be updated so that, for every user, the status of that new movie is awaiting.
Use triggers. With triggers you can update all the tables related to your table. For example, when one row is inserted into table 1, a row can be inserted into table 2 as well.
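As a sketch of what such a trigger could look like: the question is about Oracle, but the example below uses SQLite (through Python's sqlite3 module) purely so it can be run as-is, and Oracle's trigger syntax differs. The trigger name movie_added and the title column are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_info  (user_id  TEXT PRIMARY KEY);
CREATE TABLE movie_info (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE queue_info (
    queue_id INTEGER PRIMARY KEY,
    movie_id INTEGER NOT NULL REFERENCES movie_info(movie_id),
    user_id  TEXT    NOT NULL REFERENCES user_info(user_id),
    status   TEXT    NOT NULL
);
-- When a movie is added, queue it as 'awaiting' for every existing user.
CREATE TRIGGER movie_added AFTER INSERT ON movie_info
BEGIN
    INSERT INTO queue_info (movie_id, user_id, status)
    SELECT NEW.movie_id, user_id, 'awaiting' FROM user_info;
END;
""")

conn.executemany("INSERT INTO user_info VALUES (?)", [("alice",), ("bob",)])
conn.execute("INSERT INTO movie_info VALUES (1, 'First Movie')")

rows = conn.execute(
    "SELECT movie_id, user_id, status FROM queue_info ORDER BY user_id"
).fetchall()
print(rows)
```

A mirror-image trigger on user_info would be needed to cover new users; it is left out here to keep the sketch short.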
Some notes first:
I really like that you have a standardized way to name tables and fields. I would use Queue instead of Queue_info, Movie instead of Movie_info, etc., as all tables hold information (don't they?) and we all know that. I'd also choose MovieId instead of Movie_Id and ReservedDate instead of Reserved_date, but that's a matter of personal taste (allergy to underscores).
What I wanted to stress is that choosing one way for naming and keeping it is very good.
What I don't like is that while your structure seems normalized, you use Varchar type for the User_id Key. Primary (and Foreign) Keys are best if they are small in size and with constant size. This mainly helps in keeping index sizes small (so more efficient) and secondly because the keys are the only values stored repeatedly in the db (so it helps keeping db size small).
Now, to your question, do you really need this? I mean, you may end up having in your database thousands of movies and users. Do you want to add a thousand rows in the Queue table whenever a new movie is inserted? Or another thousand rows when a new user is registered? Or 50 thousand rows when a new list with 50 new movies arrives (and is inserted in the db)?
With 10K movies and 2K users, you'll have a 20M-row table. There is no problem with a table of that size, and one or more triggers will serve your need. What happens if you have 100K movies and 50K users though? A 5G-row table. You can deal with that too, but perhaps you can just keep in that table only the movies that a user is interested in (or has borrowed or has seen, whatever the purpose of the db is). And if you want a list of the movies that a certain user has not yet been interested in, check for those Movie_Id values that do not exist in the table, with something like this:
SELECT
m.Movie_Id, m.Movie_Title
FROM
Movie_info m
WHERE
NOT EXISTS
( SELECT *
FROM Queue_info q
WHERE q.Movie_Id = m.Movie_Id
AND q.User_Id = :UserId
)
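Here is the same NOT EXISTS idea as a runnable sketch, again using SQLite through Python's sqlite3 module instead of Oracle. The sample data, the movie_title column and the function name movies_not_queued are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_info (movie_id INTEGER PRIMARY KEY, movie_title TEXT);
CREATE TABLE queue_info (
    movie_id INTEGER NOT NULL,
    user_id  TEXT    NOT NULL,
    status   TEXT    NOT NULL,
    PRIMARY KEY (movie_id, user_id)
);
INSERT INTO movie_info VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO queue_info VALUES (1, 'alice', 'reserved'), (3, 'bob', 'reserved');
""")

def movies_not_queued(conn, user_id):
    """Titles of movies that have no queue row for the given user."""
    rows = conn.execute(
        """
        SELECT m.movie_title
        FROM movie_info m
        WHERE NOT EXISTS (
            SELECT 1
            FROM queue_info q
            WHERE q.movie_id = m.movie_id
              AND q.user_id = ?
        )
        ORDER BY m.movie_id
        """,
        (user_id,),
    )
    return [title for (title,) in rows]

print(movies_not_queued(conn, "alice"))
```

Only the movies a user has actually touched take up rows in queue_info; everything else is computed on demand.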

How to maintain subcategory in MYSQL?

I am having categories as following,
Fun
    Jokes
    Comedy
Action
    Movies
    TV Shows
Now one video can have multiple categories or subcategories. Let's say VideoId 23 is present in the categories Fun, Fun->Comedy and Action->TV Shows, but not in the Action category. I have no idea how I should maintain these categories in the database. Should I create only one column "CategoryId AS VARCHAR" in Videos and store the category ids as comma-separated values (1,3,4)? But then how will I fetch the records if someone is browsing the category Jokes?
Or should I create another table which will have videoId and categoryid, in that case if a Video is present in 3 different categories then 3 rows will be added to that new table
Please suggest some way of how to maintain categories for a particular record in the table
Thanks
Your categories table could have a column in it called parentID that references another entry in the categories table. It would be a foreign key to itself. NULL would represent a top-level category. Something other than NULL would represent "I am a child category of this category". You could still assign a video to any category: top-level, child, or somewhere in between.
Also, use auto-increment NOT NULL integers for your primary keys, not varchar. It's a performance consideration.
To answer your comment:
3 tables: Videos, Categories, and Video_Category
Video_Category would have VideoID and CategoryID columns. The primary key would be a combination of the two columns (a compound primary key)
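A minimal sketch of those three tables, with the self-referencing parent column, written for SQLite via Python's sqlite3 module rather than MySQL so that it runs without a server. Column names, ids and the helper videos_in_category are assumptions; the sample rows mirror the example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    parent_id   INTEGER REFERENCES categories(category_id)  -- NULL = top level
);
CREATE TABLE videos (video_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE video_category (
    video_id    INTEGER NOT NULL REFERENCES videos(video_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    PRIMARY KEY (video_id, category_id)
);
INSERT INTO categories VALUES
    (1, 'Fun', NULL), (2, 'Jokes', 1), (3, 'Comedy', 1),
    (4, 'Action', NULL), (5, 'Movies', 4), (6, 'TV Shows', 4);
INSERT INTO videos VALUES (23, 'Example video'), (24, 'Another video');
-- Video 23 is in Fun, Fun->Comedy and Action->TV Shows, as in the question.
INSERT INTO video_category VALUES (23, 1), (23, 3), (23, 6), (24, 2);
""")

def videos_in_category(conn, category_name):
    """Video ids assigned directly to the named category."""
    rows = conn.execute(
        """
        SELECT vc.video_id
        FROM video_category vc
        JOIN categories c ON c.category_id = vc.category_id
        WHERE c.name = ?
        ORDER BY vc.video_id
        """,
        (category_name,),
    )
    return [video_id for (video_id,) in rows]

print(videos_in_category(conn, "Jokes"))
```

Browsing a category becomes a plain join on indexed integer columns, which is exactly what the comma-separated column cannot offer. Looking categories up by name assumes names are unique; in a real schema you would more likely filter by category_id.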
You have two choices, parentID (better as INT) to refer to the parent or an extra table with categoryID - parentID.
The last one may provide a better logical separation and allows you to have multiple categories.
I suggest that you create another table which will have videoId and categoryid. Then you can use an SQL query as follows:
select a.*,GROUP_CONCAT(b.category_id) as category_ids
from table_video a
left join table_video_category b on a.video_id=b.video_id
group by a.video_id

How to display multiple values in a MySQL database?

I was wondering how you can store and display multiple values in a database. For example, let's say you have a user who fills out a form that asks them to type in what types of foods they like, for example cookies, candy, apples, bread and so on.
How can I store it in the MySQL database under the same field called food?
What will the structure of the food field look like?
You may want to read the excellent Wikipedia article on database normalization.
You don't want to store multiple values in a single field. You want to do something like this:
form_responses
    id
    [whatever other fields your form has]
foods_liked
    form_response_id
    food_name
Where form_responses is the table containing things that are singular (like a person's name or address, or something where there aren't multiple values). foods_liked.form_response_id is a reference to the form_responses table, so the foods liked by the person who has response number six will have a value of six for the form_response_id field in foods_liked. You'll have one row in that table for each food liked by the person.
Edit: Others have suggested a three-table structure, which is certainly better if you are limiting your users to selecting foods from a predefined list. The three-table structure may be better in the case that you are allowing them the ability to enter their own foods, though if you go that route you'll want to be careful to normalize your input (trim whitespace, fix capitalization, etc.) so you don't end up with duplicate entries in that table.
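To illustrate the point about normalizing input, here is a small sketch using SQLite through Python's sqlite3 module (not MySQL, so it can run standalone). The helper names normalize_food and food_id_for, and the exact normalization rules (collapse whitespace, lowercase), are my assumptions; real input may need more than this.

```python
import sqlite3

def normalize_food(raw):
    """Collapse whitespace and lowercase so 'Cookies ' and 'cookies' match."""
    return " ".join(raw.split()).lower()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foods (
    food_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE   -- one row per distinct, normalized name
);
""")

def food_id_for(conn, raw_name):
    """Return the id for a food name, inserting it on first use."""
    name = normalize_food(raw_name)
    conn.execute("INSERT OR IGNORE INTO foods (name) VALUES (?)", (name,))
    (food_id,) = conn.execute(
        "SELECT food_id FROM foods WHERE name = ?", (name,)
    ).fetchone()
    return food_id

ids = [food_id_for(conn, s)
       for s in ["Cookies", "  cookies ", "Apple  Pie", "apple pie"]]
print(ids)
```

The UNIQUE constraint is what actually prevents duplicates; the normalization step just makes sure near-identical spellings collide on it.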
Normally we would NOT do it like this. Try using a relation table instead.
Table 1: tbl_food
    ID primary key, auto increment
    FNAME varchar
Table 2: tbl_user
    ID primary key, auto increment
    USER varchar
Table 3: tbl_userfood
    RID auto increment
    USERID int
    FOODID int
Use a format like this to store your data, instead of a chunk of data squeezed into one field.
Querying these tables is also easier than parsing the chunk of data.
Use normalization.
More specifically, create a table called users. Create another called foods. Then link the two tables together with a many-to-many table called users_to_foods that holds foreign keys referencing the primary keys of both tables.
One way to do it would be to serialize the food data in your programming language, and then store it in the food field. This would then allow you to query the database, get the serialized food data, and convert it back into a native data structure (probably an array in this case) in your programming language.
The problem with this approach is that you will be storing a lot of the same data over and over, e.g. if a lot of people like cookies, the string "cookies" will be stored over and over. Another problem is searching for everyone who likes one particular food. To do that, you would have to select the food data for each record, unserialize it, and see if the selected food is contained within. This is very inefficient.
Instead you'll want to create 3 tables: a users table, a foods table, and a join table. The users and foods tables will contain one record for each user and food respectively. The join table will have two fields: user_id and food_id. For every food a user chooses as a favorite, it adds a record to the join table of the user's ID and the food ID.
As an example, to pull all the users who like a particular food with id FOOD_ID, your query would be:
SELECT users.id, users.name
FROM users, join_table
WHERE join_table.food_id = FOOD_ID
AND join_table.user_id = users.id;