Redis, Playlists with pointers to movies, Is this possible? - redis

I have a collection of about 150 thousand films, and several playlists in which the same films appear many times.
I want to cache the movies and playlists in Redis.
What I need is for the playlists to store only references to the movies, so that retrieving a playlist brings back the referenced films, and changing a movie changes it in every playlist.
I don't want to retrieve the playlist as a list of ids and then fetch each movie by id again. Is there a way to do this directly in one query, as if it were a pointer?
Thanks!

As I see it you have 2 options:
Write a Lua script that returns, for a given playlist, all the movies in that playlist. The main disadvantage of this approach is that it will not work on a cluster, because the movies might be located on a different shard than the playlist.
Use RedisGears (https://github.com/RedisGears/RedisGears). In a single Python line you can do exactly what you need, and it will also work on a cluster. It will look something like this (assuming the playlist is a Redis list of movie keys, with each movie stored in a Redis hash):
GB().flatmap(lambda r: r).repartition(lambda r:r).map(lambda r: execute('hgetall', r)).run('< playlist_key_name >')
flatmap - to have a record per movie
repartition - so each movie record will go to the shard where it is located
map - to perform hgetall and get the movie data
run - run will automatically collect all the data from all the shards and return them.
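To make the "pointer" idea concrete, here is a minimal in-memory sketch of the data layout both options rely on. It does not talk to Redis at all: the dictionaries `movies` and `playlists`, the key names, and the helper `resolve_playlist` are hypothetical stand-ins for Redis hashes, a Redis list, and the server-side lookup.

```python
# In-memory stand-in for Redis: each movie is a hash stored under its own key,
# and a playlist is a list holding only movie keys (the "pointers").
movies = {
    "movie:1": {"title": "Alien", "year": "1979"},
    "movie:2": {"title": "Heat", "year": "1995"},
}
playlists = {
    "playlist:scifi": ["movie:1"],
    "playlist:favorites": ["movie:1", "movie:2", "movie:1"],
}


def resolve_playlist(playlist_key):
    """Return the movie data for every key in the playlist, in playlist order."""
    # Roughly what flatmap (one record per movie key) followed by
    # map (hgetall per key) does in the pipeline above.
    return [dict(movies[movie_key]) for movie_key in playlists[playlist_key]]


# The movie is changed once, in one place ...
movies["movie:1"]["title"] = "Alien (Director's Cut)"

# ... and every playlist that references it sees the new data.
scifi = resolve_playlist("playlist:scifi")
favorites = resolve_playlist("playlist:favorites")
```

Because a playlist stores only keys, a movie that appears several times in several playlists still exists exactly once.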

Related

Photo gallery DataBase structure

So, I'm building a large-scale photo gallery and I'm a bit puzzled when it comes to structuring the database. I have little experience with NoSQL databases, so this seems like a big step up.
It is important to mention that the database will hold only URL references to the photos, which will be stored in the cloud.
Basically, I want each user to have a few photo albums, each holding around 3000 photos. I want to let the user filter an album quickly and efficiently, but only one album at a time (meaning they can't search all their photos at once).
My 2 main questions here are:
Which will be more suitable: SQL or NoSQL?
Storing photos:
Should I store photos per album, meaning giving each album an array field that holds its 3000 photo objects?
Or should I store photos as a separate collection/table and ref each to its album?
Keep in mind filtering efficiently is a high priority.
Any specific DB recommendation will be highly appreciated :)
Thank you
I would think that you would want an SQL database that supports binary objects for this, such as MariaDB, which is quite efficient for online/web applications. I would guess the basic database structure would be something like this:
create table ALBUMS (
    user_id integer,
    album_id integer,
    album_name text
);
create table PHOTOS (
    album_id integer,
    photo_name text,
    photo_data blob
);
Obviously you will want to think about keys and indices to make this more efficient, and no doubt you will have additional metadata to add as extra columns. This assumes that the albums do not have a fixed order for the photos. If they do, you will need a column for that and will want to ORDER BY that column in your select statement.
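As a rough illustration of the "keys and indices" remark, here is a sketch using Python's built-in sqlite3 module as a stand-in for MariaDB. It deviates from the schema above in ways that are assumptions, not part of the answer: it stores a `photo_url` instead of a blob (the question says only URL references are kept), and it adds hypothetical `photo_id` and `position` columns plus a composite index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE albums (
    album_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    album_name TEXT NOT NULL
);
CREATE TABLE photos (
    photo_id   INTEGER PRIMARY KEY,
    album_id   INTEGER NOT NULL REFERENCES albums(album_id),
    photo_name TEXT NOT NULL,
    photo_url  TEXT NOT NULL,      -- assumption: URL of the file in cloud storage
    position   INTEGER NOT NULL    -- assumption: display order inside the album
);
-- Composite index so per-album filtering and ordering avoid a full table scan.
CREATE INDEX idx_photos_album_position ON photos (album_id, position);
""")
conn.execute("INSERT INTO albums VALUES (1, 42, 'Holiday')")
conn.executemany(
    "INSERT INTO photos (album_id, photo_name, photo_url, position) VALUES (?, ?, ?, ?)",
    [
        (1, "beach", "https://example.com/beach.jpg", 2),
        (1, "sunset", "https://example.com/sunset.jpg", 1),
    ],
)


def photos_in_album(album_id, name_filter="%"):
    """Return (name, url) pairs for one album, filtered by name, in album order."""
    rows = conn.execute(
        "SELECT photo_name, photo_url FROM photos "
        "WHERE album_id = ? AND photo_name LIKE ? ORDER BY position",
        (album_id, name_filter),
    )
    return rows.fetchall()
```

Every query is scoped to a single `album_id`, which matches the requirement that a user filters one album at a time.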

Use DB Relation To Avoid Redundancy

I have designed an ERD of movies and TV series, which is confidential, but I can give you an overview of the database.
It has more than 20 tables (more will be added later) and it is normalized. I have tables like Movie, Actors, Tv Series, Director, Producer etc. These tables contain the most important information and are connected by foreign keys and middle tables like MovieActor, MovieDirector etc.
So the scenario is like
1) The standard “starting” database should have Actors, Directors, Producers, Music Composers, Genres, Resolution Types… pre populated and pre defined by the Admin.
2) Every user creating a personal movie collection starts off with a database containing all the predefined data, but may add further data to their personal database if they want to. These changes affect only their database and not the standard "starting" database (which was defined by the Admin).
3) The Admin should have a separate view to add Actors, Directors, Producers… that will become part of the standard "starting" database. Any further changes done to this database will be available to the users as updates.
Suggested Solution
Question
The suggested solution seems to require creating a new database for each user, which does not seem feasible. My question is how I can adapt the suggested solution so that it becomes effective and feasible. I would prefer to handle the situation using database relations, not separate storage.
You wouldn't create multiple databases, you would simply add an ownerId field to all relevant tables - admin would have ownerId = 0, indicating the row is part of the 'starting database' and new admin entries are instantly available to users.
In any output for a user where you want to display the starting data and their own, you would add WHERE (ownerId = 0 or ownerId = userId) to the appropriate query or if they need to see just their own, just ownerId = userId.
Presumably, they would be able to create relationships between their own data or 'starting' data and this approach should still work.
Foreign keys will still work, but deleting a starting row will also delete the user data that references it. Basically, you should only ever add to the starting data, never take away, or you will run into problems.
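The ownerId filter can be sketched with Python's built-in sqlite3 module. The `actor` table, its columns other than `ownerId`, and the sample rows are hypothetical; only the `ownerId = 0 OR ownerId = userId` condition comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, ownerId INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO actor (name, ownerId) VALUES (?, ?)",
    [
        ("Starting Actor", 0),  # ownerId 0: admin-defined starting data
        ("Alice's Actor", 7),   # private row of user 7
        ("Bob's Actor", 8),     # private row of user 8
    ],
)


def visible_actors(user_id):
    """Return the starting rows plus the rows owned by this user."""
    rows = conn.execute(
        "SELECT name FROM actor WHERE ownerId = 0 OR ownerId = ? ORDER BY id",
        (user_id,),
    )
    return [name for (name,) in rows]


def own_actors(user_id):
    """Return only the rows this user added."""
    rows = conn.execute(
        "SELECT name FROM actor WHERE ownerId = ? ORDER BY id", (user_id,)
    )
    return [name for (name,) in rows]
```

A row the admin adds with ownerId = 0 shows up in every user's `visible_actors` result without any copying.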

Database design: a many-to-one design where order matters

I have two tables, User and Game. Game has columns sideRed and sideBlue. Each side has exactly one user. User has column activeGame. If sideRed and sideBlue are one-to-one relationships, then where does the back reference activeGame go?
There are many users and many games, and users should be connected to games somehow. This is a typical m:n relationship.
In your case this is restricted: each game has exactly one user as sideRed and one user as sideBlue. At the moment your game table has two FK columns to the user table to reference the blue and the red user. Correct so far?
Ask yourself some questions:
Can a game connect to more than these two users (maybe later)?
Can a user connect to several games? (Probably, as you are looking for a place to mark the active game.)
Is the user allowed to play more than one game (of the same type) actively (maybe later)?
If you have several different games: can a user take part in several different games at the same time? If yes: are there several actively played different games, so that you'd need more than one activeGame flag?
You should always expect your ideas to grow :-)
You can put an FK column into the user table to reference the active game. The problem: the user must exist before you can fill the red and blue columns of the game row, but the game must exist before you can fill the activeGame column of the user. This cross reference needs special effort on inserts...
Alternatively, you can add two BIT columns next to sideRed and sideBlue to mark that reference as the active one. In this case you'd have to make sure that you never allow more than one active flag per user.
My suggestion (see update-section!)
Place a mapping table in between
table game (just meta data to describe the game, no instance data)
table user (just meta data to describe the user, no instance data)
table UserGame (UserID, GameID, TypeID [red,blue,...] ...)
table Session (the actively played game: UserGameID, loginTime, ...)
A wise man said: a good database is recognized by the count of its tables. The more, the better :-)
Well, "the more the better" might not be a general rule, but in most cases one should not be afraid to invest a bit more in a good and scalable structure.
UPDATE
Your comment
I like this solution because it has strong separation of concerns.
Let's say Shnugo made a move, and we must broadcast to all users.
Shnugo's client sends a UserGameID and the requested move. Then you'd
1. Query for UserGame matching UserGameID.
2. Query Game matching GameID and apply move.
3. Query UserGames matching GameID and follow the join to get a list of UserIDs in the game.
Is this correct?
The game table in my design is just a meta-description of the game itself (Name, Icon, some rules ...). The status of a specific game (e.g. the current positions of all chess pieces) would need one (or several) more table(s), while the UserGame table should hold a reference to this GameStatus table instead of a GameID. You might need several different tables, as the status of Chess needs other structures than the status of Poker. But looking into UserGame will tell you which game is played, and so you know which table to look into.
New suggestion
table Game (just meta data to describe the game, no instance data)
table User (just meta data to describe the user, no instance data)
table GameInstance (status of currently played game, GameID,StatusID, ...)
game specific table(s) to store the game's current status
table UserGame (UserID, GameInstanceID, TypeID [red,blue,...] ...)
table Session (the actively played game: UserGameID, loginTime, ...)
Another possibility would be to store all moves and calculate the current status from them. But this will not work for every kind of game.
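A minimal sketch of the mapping-table part of the new suggestion, using Python's built-in sqlite3 module. The column types, the `Side` text column (standing in for TypeID), the UNIQUE constraint, and the helper `users_in_same_game` are assumptions made for illustration; the Session table and the game-specific status tables are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Game (GameID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE User (UserID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE GameInstance (
    GameInstanceID INTEGER PRIMARY KEY,
    GameID         INTEGER NOT NULL REFERENCES Game(GameID)
);
CREATE TABLE UserGame (
    UserGameID     INTEGER PRIMARY KEY,
    UserID         INTEGER NOT NULL REFERENCES User(UserID),
    GameInstanceID INTEGER NOT NULL REFERENCES GameInstance(GameInstanceID),
    Side           TEXT NOT NULL,       -- stands in for TypeID [red, blue, ...]
    UNIQUE (GameInstanceID, Side)       -- assumption: one user per side per instance
);
INSERT INTO Game VALUES (1, 'Chess');
INSERT INTO User VALUES (1, 'Shnugo'), (2, 'Opponent');
INSERT INTO GameInstance VALUES (10, 1);
INSERT INTO UserGame VALUES (100, 1, 10, 'red'), (101, 2, 10, 'blue');
""")


def users_in_same_game(user_game_id):
    """Return the UserIDs of everyone in the game instance of this UserGame row."""
    rows = conn.execute(
        "SELECT other.UserID FROM UserGame AS mine "
        "JOIN UserGame AS other ON other.GameInstanceID = mine.GameInstanceID "
        "WHERE mine.UserGameID = ? ORDER BY other.UserID",
        (user_game_id,),
    )
    return [user_id for (user_id,) in rows]
```

This covers the broadcast case from the comment: starting from the UserGameID a client sends, one self-join yields all users who should receive the move.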

Linking table entry to multiple other entries

I'm attempting to create a basic music database for a school project.
I would like to link each song to three (3) other 'similar' songs.
I know how to link two tables together with FOREIGN KEY but I'm unsure how to link two entries within the same table.
The programs I'm using are PHPmyadmin and DBDesigner 4.
Thanks in advance for any assistance :)
First of all, when designing a database, you never want to "assume '(3)'." In other words, you don't want "repeating groups" such that the database design would be broken if you ever needed 4.
To me, "is-similar-to" is a many-to-many relationship that would list an arbitrary number of songs that are similar, with a structure such as
SONG_ID_1,
SONG_ID_2,
DEGREE_OF_SIMILARITY (some kind of percentage ...?)
So, for any song, you'd be looking to this table to find all songs that have ever been listed as "similar to" this song. You'd incorporate this table with an INNER JOIN, and be prepared to deal with an arbitrary number of matches.
You're looking for a relationship table, where you keep the song id along with the linked song id.
Example:
Table Song - ID PRIMARY KEY, NAME
Table Song_Link - ID_SONG (from Table Song), ID_LINKED_SONG (from Table Song)
This way you can store the link between both songs on a row basis.
Take into consideration that the link goes both ways.
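Here is a sketch of the relationship table using Python's built-in sqlite3 module, with the table and column names from the example above. Storing each pair only once (smaller ID first, enforced by a CHECK constraint) and reading it in both directions is one possible way to handle the two-way link; that choice and the helper `similar_songs` are assumptions, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Song (ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL);
CREATE TABLE Song_Link (
    ID_SONG        INTEGER NOT NULL REFERENCES Song(ID),
    ID_LINKED_SONG INTEGER NOT NULL REFERENCES Song(ID),
    PRIMARY KEY (ID_SONG, ID_LINKED_SONG),
    CHECK (ID_SONG < ID_LINKED_SONG)   -- store each pair once, smaller ID first
);
INSERT INTO Song VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO Song_Link VALUES (1, 2), (1, 3), (2, 4);
""")


def similar_songs(song_id):
    """Return names of all songs linked to song_id, in either direction."""
    rows = conn.execute(
        "SELECT s.NAME FROM Song_Link l JOIN Song s ON s.ID = l.ID_LINKED_SONG "
        "WHERE l.ID_SONG = ? "
        "UNION "
        "SELECT s.NAME FROM Song_Link l JOIN Song s ON s.ID = l.ID_SONG "
        "WHERE l.ID_LINKED_SONG = ?",
        (song_id, song_id),
    )
    return sorted(name for (name,) in rows)
```

Nothing in the table limits a song to three links, so the design keeps working if a fourth similar song is ever needed.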

Join Table vs Array of ids: PSQL

Let's say I have two models, songs and users, and I want to let users favorite songs.
One way to do this would be to create a join table between users and songs, let's call it favorites. Each row of this table would just have a user id and song id and its own id. I have some experience with this method and it works fine as far as I know.
However, I was thinking that a second way you could implement favorites would be adding a column onto the user model that consists of an array of song ids. Each id would match a song that the user had favorited.
I'm wondering which of these solutions is preferable and why.
If your site has a huge amount of traffic, then you should go with the materialized views database design concept.
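For comparison, the join-table option described in the question can be sketched as follows. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the column names, the UNIQUE constraint, and the helpers `favorite` and `favorites_of` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE favorites (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    song_id INTEGER NOT NULL REFERENCES songs(id),
    UNIQUE (user_id, song_id)           -- a song can be favorited once per user
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO songs VALUES (1, 'First Song'), (2, 'Second Song');
""")


def favorite(user_id, song_id):
    """Mark a song as a favorite; repeating the call has no further effect."""
    conn.execute(
        "INSERT OR IGNORE INTO favorites (user_id, song_id) VALUES (?, ?)",
        (user_id, song_id),
    )


def favorites_of(user_id):
    """Return the titles of the songs this user has favorited."""
    rows = conn.execute(
        "SELECT s.title FROM favorites f JOIN songs s ON s.id = f.song_id "
        "WHERE f.user_id = ? ORDER BY s.id",
        (user_id,),
    )
    return [title for (title,) in rows]


favorite(1, 2)
favorite(1, 2)  # duplicate, ignored because of the UNIQUE constraint
favorite(1, 1)
```

With a join table the database can constrain both ids and answer the reverse question (which users favorited a given song) with the same kind of join, which an array column on the user row does not offer directly.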