Database design: a many-to-one design where order matters

I have two tables, User and Game. Game has columns sideRed and sideBlue. Each side has exactly one user. User has column activeGame. If sideRed and sideBlue are one-to-one relationships, then where does the back reference activeGame go?

There are many users and many games, and users must be connected to games somehow. This is a typical m:n relationship.
In your case this is restricted: each game has exactly one user as sideRed and one user as sideBlue. At the moment your Game table has two FK columns referencing the User table, one for the red user and one for the blue user. Correct so far?
Ask yourself some questions:
Can a game connect to more than these two users (maybe later)?
Can a user connect to several games? (Probably, as you are looking for a place to mark the active game.)
Is the user allowed to play more than one game (of the same type) actively (maybe later)?
If you have several different games: can a user take part in several different games at the same time? If yes: are there several actively played games, so that you'd need more than one activeGame flag?
You should always allow room for your ideas to grow :-)
You can put a fk-column into user-table to reference the active game. The problem: Your user must exist to fill the red and the blue column of the game row, but the game must exist to fill the activeGame column at the user. This cross reference needs special efforts on inserts...
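One common way to handle this cross reference is to make the back reference nullable and fill it in after both rows exist, all inside one transaction. The following is only a sketch using SQLite through Python's sqlite3 module; the column names id and name and the helper start_game are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE User (
    id         INTEGER PRIMARY KEY,           -- assumed key column
    name       TEXT NOT NULL,                 -- assumed column
    activeGame INTEGER REFERENCES Game(id)    -- nullable: set after the game exists
);
CREATE TABLE Game (
    id       INTEGER PRIMARY KEY,
    sideRed  INTEGER NOT NULL REFERENCES User(id),
    sideBlue INTEGER NOT NULL REFERENCES User(id)
);
""")


def start_game(conn, red_name, blue_name):
    """Hypothetical helper: create both users, then the game, then the back reference."""
    with conn:  # one transaction: commit on success, roll back on error
        red = conn.execute("INSERT INTO User (name) VALUES (?)", (red_name,)).lastrowid
        blue = conn.execute("INSERT INTO User (name) VALUES (?)", (blue_name,)).lastrowid
        game = conn.execute(
            "INSERT INTO Game (sideRed, sideBlue) VALUES (?, ?)", (red, blue)
        ).lastrowid
        # Only now can the users point back at the game.
        conn.execute(
            "UPDATE User SET activeGame = ? WHERE id IN (?, ?)", (game, red, blue)
        )
    return game


game_id = start_game(conn, "alice", "bob")
active = conn.execute("SELECT name, activeGame FROM User ORDER BY id").fetchall()
```

The price of this approach is that activeGame can never be declared NOT NULL, which is part of the "special effort" mentioned above.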
You can add two BIT columns next to sideRed and sideBlue to mark the respective reference as the active one. In this case you'd have to make sure that no user ends up with more than one active flag.
My suggestion (see update-section!)
Place a mapping table in between
table game (just meta data to describe the game, no instance data)
table user (just meta data to describe the user, no instance data)
table UserGame (UserID, GameID, TypeID [red,blue,...] ...)
table Session (the actively played game: UserGameID, loginTime, ...)
A wise man said: a good database can be recognized by the number of its tables. The more, the better :-)
Well, "the more the better" might not be a general rule, but in most cases one should not be afraid to invest a bit more in a good and scalable structure.
UPDATE
Your comment
I like this solution because it has strong separation of concerns.
Let's say Shnugo made a move, and we must broadcast to all users.
Shnugo's client sends a UserGameID and the requested move. Then you'd
1. Query for UserGame matching UserGameID.
2. Query Game matching GameID and apply move.
3. Query UserGames matching GameID and follow the join to get a list of UserIDs in the game.
Is this correct?
The Game table in my design is just a meta-description of the game itself (name, icon, some rules ...). The status of a specific game (e.g. the current positions of all chess pieces) would need one or more additional tables, and the UserGame table should hold a reference to this GameStatus table instead of a GameID. You might need several different tables, as the status of chess needs other structures than the status of poker. But looking into UserGame will tell you which game is played, so you know which table to look into.
New suggestion
table Game (just meta data to describe the game, no instance data)
table User (just meta data to describe the user, no instance data)
table GameInstance (status of currently played game, GameID,StatusID, ...)
game specific table(s) to store the game's current status
table UserGame (UserID, GameInstanceID, TypeID [red,blue,...] ...)
table Session (the actively played game: UserGameID, loginTime, ...)
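The structure above could look roughly like the following sketch, again in SQLite via Python. Every column not named in the list (Name, SessionID, the Side text column standing in for TypeID) and the helper players_to_notify are assumptions. The helper answers the question from the comment: given the UserGameID sent by a client, find all users in the same game instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Game (GameID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE User (UserID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE GameInstance (
    GameInstanceID INTEGER PRIMARY KEY,
    GameID         INTEGER NOT NULL REFERENCES Game(GameID)
);
CREATE TABLE UserGame (
    UserGameID     INTEGER PRIMARY KEY,
    UserID         INTEGER NOT NULL REFERENCES User(UserID),
    GameInstanceID INTEGER NOT NULL REFERENCES GameInstance(GameInstanceID),
    Side           TEXT NOT NULL,          -- assumed stand-in for TypeID [red, blue, ...]
    UNIQUE (GameInstanceID, Side)          -- one user per side and game instance
);
CREATE TABLE Session (
    SessionID  INTEGER PRIMARY KEY,        -- assumed surrogate key
    UserGameID INTEGER NOT NULL REFERENCES UserGame(UserGameID),
    loginTime  TEXT NOT NULL
);
INSERT INTO Game VALUES (1, 'Chess');
INSERT INTO User VALUES (1, 'Shnugo'), (2, 'Opponent');
INSERT INTO GameInstance VALUES (10, 1);
INSERT INTO UserGame VALUES (100, 1, 10, 'red'), (101, 2, 10, 'blue');
""")


def players_to_notify(conn, user_game_id):
    """Hypothetical helper: all UserIDs sharing a game instance with this UserGame row."""
    rows = conn.execute(
        """
        SELECT other.UserID
        FROM UserGame AS mine
        JOIN UserGame AS other ON other.GameInstanceID = mine.GameInstanceID
        WHERE mine.UserGameID = ?
        ORDER BY other.UserID
        """,
        (user_game_id,),
    ).fetchall()
    return [user_id for (user_id,) in rows]


notify = players_to_notify(conn, 100)
```

With the UNIQUE constraint on (GameInstanceID, Side), the "exactly one user per side" rule from the question is enforced by the mapping table rather than by two fixed columns.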
Another possibility would be to store all moves and calculate the current status from them. But this will not work for every kind of game.

Related

Create simple database for chess tournaments

I am trying to make a simple app for chess tournaments, but I have a problem with the database. I have users that participate in a tournament (that's fine), but how do I assign users to rounds and matches? Should I make further relations: user_tournament-round-tournament and user_tournament-match-round?

Please see this answer as food for thought rather than a solution. Your question does not contain enough information to fully cover all use cases, so the answer below contains a lot of speculation.
In my oversimplified view, and picking up on your initial model, the tournament_competitors table (renamed from user_tournament, as we have competitors and not users) would create a unique id for each enrolled competitor. This id would be used as a reference in a tournament_matches table. That table would link twice to tournament_competitors, since it connects two opponents (constraint warning). The table would also register the match type.
For the match type, I see two possibilities.
The matches table would list all possible match types (final, semi-final, quarter-final, elimination rounds, etc.) and these would be referred to in the tournament_matches table via id (composite key in the form tournament_id-competitor_id-group_id). This approach, especially for the elimination round matches, requires a way to link the number of competitors in each elimination group with the number of matches each competitor has to go through before they are considered eliminated or not, which gives a round number. I see this as business logic, so it does not belong in the DB. The group_id also needs to be calculated, and that would be best done in the application.
The alternative is to have the various match types in the tournament_matches table as a free field - populated by the application. The tournament structure (Number of Groups, number of opponents in each group, etc.) would be defined as attributes in the tournaments table. In this view there is no need for the rounds table.

Use DB Relation To Avoid Redundancy

I have designed an ERD of movies and tv series which is confidential. I can give you an overview of database.
It has more than 20 tables (more tables will be added later) and it is normalized. I have tables like Movie, Actors, Tv Series, Director, Producer etc. These tables contain the most important information and are connected by foreign keys and junction tables like MovieActor, MovieDirector etc.
So the scenario is like
1) The standard “starting” database should have Actors, Directors, Producers, Music Composers, Genres, Resolution Types… pre populated and pre defined by the Admin.
2) Every user creating his personal movie collection will start off his database with all the predefined data, but if he wants to, he may add further data to his personal database. These changes will only affect his database and not the standard "starting" database (which was defined by the Admin).
3) The Admin should have a separate view to add Actors, Directors, Producers… that will become part of the standard "starting" database. Any further changes done to this database will be available to the users as updates.
Suggested Solution
Question
The suggested solution seems to require creating a new database for each user, which does not seem feasible. My question is how I can adapt the suggested solution so that it becomes effective and feasible. I would prefer to handle the situation by using database relations, not separate storage.

You wouldn't create multiple databases, you would simply add an ownerId field to all relevant tables - admin would have ownerId = 0, indicating the row is part of the 'starting database' and new admin entries are instantly available to users.
In any output for a user where you want to display the starting data and their own, you would add WHERE (ownerId = 0 or ownerId = userId) to the appropriate query or if they need to see just their own, just ownerId = userId.
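A minimal sketch of that filter, using SQLite via Python. The Actor table layout, the sample rows and the helper visible_actors are assumptions for illustration; only the ownerId convention (0 = admin / starting data) comes from the text above.

```python
import sqlite3

ADMIN_OWNER_ID = 0  # rows with this owner form the shared 'starting database'

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Actor (
    actorId INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    ownerId INTEGER NOT NULL DEFAULT 0
);
INSERT INTO Actor (name, ownerId) VALUES
    ('Starting Actor', 0),
    ('Private Actor of user 7', 7),
    ('Private Actor of user 8', 8);
""")


def visible_actors(conn, user_id):
    """Hypothetical helper: starting data plus the user's own rows."""
    rows = conn.execute(
        "SELECT name FROM Actor WHERE ownerId = ? OR ownerId = ? ORDER BY actorId",
        (ADMIN_OWNER_ID, user_id),
    ).fetchall()
    return [name for (name,) in rows]


actors_for_7 = visible_actors(conn, 7)
```

User 7 sees the shared row and his own row, but nothing that belongs to user 8.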
Presumably, they would be able to create relationships between their own data or 'starting' data and this approach should still work.
Foreign keys will still work but deleting will delete user data - basically you should only ever add to the starting data, not take away or you will run into problems.

Is there anything fundamentally wrong with my database relations table?

Would this be the correct layout for a diagram as such? A few of these tables share the same primary key, but I am not sure if this is the best practise/correct relationships that I should set out.
It's for a local level, whereby players don't change teams and assuming that player positions are final. The aim is to gather statistics to show later for analysis.

The Squad table should be a linking table that creates a many to many relationship between Players and Team. Since each Player/Team combination can occur only once, both columns Team_ID and Player_ID should be part of the primary key.
Squad should be on the n-side of two relationships. Its name should probably be something like Membership.
Why do you need a separate PlayerStatistics table? Apparently it stores statistics for the same Player_ID/Team_ID combinations as Squad. The fields of this table should go to the Squad table.
Shouldn't the Positions be per membership? One position per membership, i.e. one player has one defined position in each team, in which case Position_ID should be a column in Squad.
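The points so far (linking table, composite primary key, statistics and position stored per membership) might be sketched like this in SQLite via Python. The Name columns, the Goals statistic and the helper add_membership are assumptions, since the original diagram is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Team     (Team_ID     INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Player   (Player_ID   INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Position (Position_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Membership (                 -- the former Squad table
    Team_ID     INTEGER NOT NULL REFERENCES Team(Team_ID),
    Player_ID   INTEGER NOT NULL REFERENCES Player(Player_ID),
    Position_ID INTEGER NOT NULL REFERENCES Position(Position_ID),
    Goals       INTEGER NOT NULL DEFAULT 0,   -- hypothetical statistics column
    PRIMARY KEY (Team_ID, Player_ID)          -- each player/team pair only once
);
INSERT INTO Team VALUES (1, 'Reds');
INSERT INTO Player VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO Position VALUES (1, 'Goalkeeper');
INSERT INTO Membership (Team_ID, Player_ID, Position_ID) VALUES (1, 1, 1), (1, 2, 1);
""")


def add_membership(conn, team_id, player_id, position_id):
    """Hypothetical helper: returns False when a constraint rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Membership (Team_ID, Player_ID, Position_ID) VALUES (?, ?, ?)",
                (team_id, player_id, position_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = add_membership(conn, 1, 1, 1)  # same player/team pair again
squad_size = conn.execute(
    "SELECT COUNT(*) FROM Membership WHERE Team_ID = 1"
).fetchone()[0]
```

Unlike a Squad table keyed on a single column, this allows many players per team while still rejecting a duplicate player/team combination.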
There should be two relationships between Team and MatchStatistics. One on Home_team_ID and one on Away_team_ID.
Alternatively you could associate the PlayerStatistics to Player and Match and thus store what each player has done in each single game. You would then retrieve the overall player statistics or the player-per-team statistics through appropriate queries.

Messed up in my mind unless you have some strange requirements.
With this design a squad is limited to a single player.
Team_ID is associated with Statistics (not player). If you want a player associated with a single team then do that in Players. And then you should actually merge Statistics with Players.
A link on PK to PK between two tables is rarely a proper design.
If you want a player to be able to play on multiple teams then have PlayerID, TeamID as a composite key in Statistics.
You need to disclose the requirements for a proper review. Squad is clearly messed up but you have not stated the purpose of squad.

SSIS Population of Slowly Changing Dimension with outrigger

Working on a data warehouse, a suitable analogy for the problem is that we have Healthcare Practitioners. Healthcare Practitioners have a number of professional attributes and work in an open number of teams and in an open number of clinical areas.
For example, you may have a nurse who works in children's services across a number of teams as a relief/contractor/bank staff person. Or you may have a newly qualified doctor who works general medicine who is doing time in a special area pending qualifying as a consultant of that special area.
So we have an open number of areas of work and an open number of teams, which means we can't have team 1, team 2 etc. in our dimensions. The other attributes may also change over time, like base location (where they work out of) and the main team and area they work in.
So, following Kimball, I've gone for outriggers:
Table DimHealthProfessionals:
Key (primary key, identity)
Name
Main Team
Main Area of Work
Base Location
Other Attribute 1
Other Attribute 2
Start Date
End Date
Table OutriggerHealthProfessionalTeam:
HPKey (foreign key to DimHealthProfessionals.Key)
Team Name
Team Type
Other Team Attribute 1
Other Team Attribute 2
Table OutriggerHealthProfessionalAreaOfWork:
HPKey (as above)
Area of Work
Other AoW attribute 1
If any attribute of the HP changes, or the combination of teams or areas of work in which they work changes, we need to create a new entry in the SCD and its outrigger tables to encapsulate this.
And we're doing this in SSIS.
The source data is basically an HP table with the main attributes, a table of areas of work, a table of teams and a pair of mapping tables to map a current set of areas of work to an HP.
I have three data sources, one brings in the HCP information, one the areas of work of all HCPs and one the team memberships.
The problem is how to run over all three datasets to determine if an HP has changed an attribute, and if they have changed an attribute, how we update the DIM and two outriggers appropriately.
Can anyone point me at a best practice for this? OR suggest an alternative way of modelling this dimension?

Admittedly I may not understand everything here, but it seems to me that the relationship in this example should be reversed. Place TeamKey and the WorkAreaKey in the dimHealthProfessionals -- this should simplify things.
With this in place, you simply make sure to deliver outriggers before the dimHealthProfessionals.
Treat outriggers as dimensions in their own right. You may want to treat dimHealthProfessionals as a type 2 dimension, to properly capture the history.
EDIT
Considering that team to person is many-to-many, a fact is more appropriate.
A column in a dimension table is appropriate only if a person can belong to only one team at a time. Same with work areas.

The problem is how to run over all three datasets to determine if an HP has changed an attribute, and if they have changed an attribute, how we update the DIM and two outriggers appropriately.
Can anyone point me at a best practice for this? OR suggest an alternative way of modelling this dimension?
I'm not sure I understand your question fully. If you are unsure about change detection, then use Checksums in the package. Build up a temp table with the data as it is in the source, then compare each row to its counterpart (joined via the business keys) by computing the checksum for both rows and comparing those. If they differ, the data has changed.
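The checksum comparison can be illustrated outside of SSIS with a few lines of Python. This is not an SSIS package, only a sketch of the idea; the function names, the hp_id business key and the sample rows are all assumptions.

```python
import hashlib


def row_checksum(row, columns):
    """Hash the tracked columns of one row (NULL and '' are treated alike here)."""
    payload = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_keys(source_rows, dim_rows, key, columns):
    """Business keys whose source checksum differs from the dimension's.

    Keys that are missing from the dimension are reported too (new rows).
    """
    current = {r[key]: row_checksum(r, columns) for r in dim_rows}
    return [
        r[key] for r in source_rows
        if current.get(r[key]) != row_checksum(r, columns)
    ]


tracked = ["name", "main_team"]
source = [
    {"hp_id": 1, "name": "A", "main_team": "T1"},
    {"hp_id": 2, "name": "B", "main_team": "T2"},   # team changed in the source
]
dimension = [
    {"hp_id": 1, "name": "A", "main_team": "T1"},
    {"hp_id": 2, "name": "B", "main_team": "T9"},
]
changed = changed_keys(source, dimension, "hp_id", tracked)
```

For the team and area-of-work sets, one option is to sort and concatenate the member names into a single tracked value per HP, so that a change in the combination also changes the checksum.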
If you are talking about cascading updates in a historized dimension hierarchy (and you can treat the outriggers like a hierarchy in this context) then the foreign key lookups will automatically lookup the newer entry in DimHealthProfessionals if you have a historization (i.e. have validFrom / validThrough timestamps in DimHealthProfessionals). Those different foreign keys result in a different checksum.

Best Schema to represent NCAA Basketball Bracket

What is the best database schema to represent an NCAA men's basketball bracket? Here is a link if you aren't familiar: http://www.cbssports.com/collegebasketball/mayhem/brackets/viewable_men
I can see several different ways you could model this data, with a single table, many tables, hard-coded columns, somewhat dynamic ways, etc. You need a way to model both what seed and place each team is in, along with each game and the outcome (and possibly score) of each. You also need a way to represent who plays who at what stage in the tournament.
In the spirit of March Madness, I thought this would be a good question. There are some obvious answers here, and the main goal of this question is to see all of the different ways you could answer it. Which way is best could be subjective to the language you are using or how exactly you are working with it, but try to keep the answers db agnostic, language agnostic and fairly high level. If anyone has any suggestions on a better way to word this question or a better way to define it let me know in the comments.

The natural inclination is to look at a bracket in the order the games are played. You read the traditional diagram from the outside in. But let's think of it the other way around. Each game is played between two teams. One wins, the other loses.
Now, there's a bit more to it than just this. The winners of a particular pair of games face off against each other in another game. So there's also a relationship between the games themselves, irrespective of who's playing in those games. That is, the teams that face off in each game (except in the first round) are the winners of two earlier games.
So you might notice that each game has two "child games" that precede it and determine who faces off in that game. This sounds exactly like a binary tree: each node has at most two child nodes. If you know who wins each game, you can easily determine the teams in the "parent" games.
So, to design a database to model this, you really only need two entities: Team and Game. Each Game has two foreign keys that relate to other Games. The names don't matter, but we would model them as separate keys to enforce the requirement that each game have no more than two preceding games. Let's call them leftGame and rightGame, to keep with the binary tree nomenclature. Similarly, we should have a key called parentGame that tracks the reverse relationship.
Also, as I noted earlier, you can easily determine the teams that face off in each game by looking at who won the two preceding games. So you really only need to track the winner of each game. So, give the Game entity a winner foreign key to the Team table.
Now, there's the small matter of seeding the bracket. That is, modeling the match-ups for the first round games. You could model this by having a Game for each team in the overall competition where that team is the winner and has no preceding games.
So, the overall schema would be:
Game:
winner: Team
leftGame: Game
rightGame: Game
parentGame: Game
other attributes as you see fit
Team:
name
other attributes as you see fit
Of course, you would add all the other information you'd want to the entities: location, scores, outcome (in case the game was won by forfeit or some other out of the ordinary condition).
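To make the tree idea concrete, here is a sketch of the schema in SQLite via Python with a four-team bracket. The id and name columns, the sample teams and the helper teams_in_game are assumptions; the Game columns follow the schema listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Team (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Game (
    id         INTEGER PRIMARY KEY,
    winner     INTEGER REFERENCES Team(id),
    leftGame   INTEGER REFERENCES Game(id),
    rightGame  INTEGER REFERENCES Game(id),
    parentGame INTEGER REFERENCES Game(id)
);
INSERT INTO Team VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
-- Seeding: one game per team with that team as winner and no preceding games.
INSERT INTO Game VALUES
    (1, 1, NULL, NULL, 5), (2, 2, NULL, NULL, 5),
    (3, 3, NULL, NULL, 6), (4, 4, NULL, NULL, 6);
-- Games 5 and 6 are the first real round, game 7 is the final (not yet played).
INSERT INTO Game VALUES
    (5, 1, 1, 2, 7), (6, 4, 3, 4, 7), (7, NULL, 5, 6, NULL);
""")


def teams_in_game(conn, game_id):
    """Hypothetical helper: names of the winners of the two preceding games.

    Returns None for seeding games, which have no preceding games.
    """
    return conn.execute(
        """
        SELECT lt.name, rt.name
        FROM Game AS g
        JOIN Game AS l ON l.id = g.leftGame
        JOIN Game AS r ON r.id = g.rightGame
        LEFT JOIN Team AS lt ON lt.id = l.winner
        LEFT JOIN Team AS rt ON rt.id = r.winner
        WHERE g.id = ?
        """,
        (game_id,),
    ).fetchone()


final_teams = teams_in_game(conn, 7)
```

Because the opponents are derived from the child games, the only per-game fact that has to be stored is the winner.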

For a RDBMS, I think the simplest approach that's still flexible enough to accommodate the majority of situations is to do the following:
Teams has [team-id (PK)], [name], [region-id (FK to Regions)], [initial-seed]. You will have one entry for each team. (The regions table is a trivial code table with only four entries, one for each NCAA region, and is not listed here.)
Participants has [game-id (FK to Games)], [team-id (FK to Teams)], [score (nullable)], [outcome]. [score] is nullable to reflect that a team might forfeit. You will typically have two Participants per Game.
Games has [game-id (PK)], [date], [location]. To find out which teams played in a game, look up the appropriate game-id in the Participants table. (Remember, there might be more than two teams if someone dropped out or was disqualified.)
To set up the initial bracket, match the appropriate seeds to each other. As games are played, note which team has outcome = Winner for a particular game; this team is matched up against the winner of another game. Fill in the bracket until there are no more winning teams left.

Since you didn't specify an RDBMS, I'm gonna be a little different and go with a CouchDB approach, since I was reading about that this weekend. Here's the document structure I've come up with to represent a game.
{
"round" : 1, //The final would be round 5, and I guess Alabama St. vs. Morehead would be 0
"location" : "Dayton, OH",
"division": "South",
"teams" : ["UNC", "Radford"], //A feature of Couch is that fields like teams don't need a fixed number of columns.
"winner" : "UNC" //Showing my bias
}
A more interesting or complete application might have data for teams, rankings, and the like stored somewhere as well. John's approach covers that angle well, it seems. I welcome any comments from people who know better on my Couch skills.

I created a small system with the following tables:
Games: GameId, TournId, RoundId, Sequence, Date, VisitorId, VisitorScore, HomeId, HomeScore, WinnerId, WinnerGameId, WinnerHome (bit)
Predictions: PredId, UserId, GameId, PredVisitorId, PredHomeId, PredWinnerId
Rounds: RoundId, TournId, RoundNum, Heading1, Heading2
Teams: TeamId, TournId, TeamName, Seed, MoreInfo, Url
Tournaments: TournId, TournDesc
Users: TournId, UserName
WinnerGameId connects the winner of a game to their next game. WinnerHome tells whether the winner is the home or visitor of that next game. Other than that, I think it's pretty self explanatory.
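How WinnerGameId and WinnerHome move a winner into the next game can be sketched as follows, in SQLite via Python. Only a subset of the Games columns is used, the sample team ids are made up, and record_result is a hypothetical helper that assumes games cannot end in a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Games (
    GameId       INTEGER PRIMARY KEY,
    VisitorId    INTEGER, VisitorScore INTEGER,
    HomeId       INTEGER, HomeScore    INTEGER,
    WinnerId     INTEGER,
    WinnerGameId INTEGER REFERENCES Games(GameId),
    WinnerHome   INTEGER   -- bit: 1 = winner becomes home team of the next game
);
INSERT INTO Games (GameId, VisitorId, HomeId, WinnerGameId, WinnerHome) VALUES
    (1, 16, 1, 3, 1),
    (2, 9, 8, 3, 0),
    (3, NULL, NULL, NULL, NULL);   -- filled in as winners advance
""")


def record_result(conn, game_id, visitor_score, home_score):
    """Hypothetical helper: store the score and advance the winner."""
    with conn:
        visitor_id, home_id, next_game, winner_home = conn.execute(
            "SELECT VisitorId, HomeId, WinnerGameId, WinnerHome "
            "FROM Games WHERE GameId = ?",
            (game_id,),
        ).fetchone()
        winner = home_id if home_score > visitor_score else visitor_id
        conn.execute(
            "UPDATE Games SET VisitorScore = ?, HomeScore = ?, WinnerId = ? "
            "WHERE GameId = ?",
            (visitor_score, home_score, winner, game_id),
        )
        if next_game is not None:
            # The column name comes from a fixed pair, never from user input.
            slot = "HomeId" if winner_home else "VisitorId"
            conn.execute(
                f"UPDATE Games SET {slot} = ? WHERE GameId = ?", (winner, next_game)
            )
    return winner


record_result(conn, 1, 60, 75)   # home team 1 wins and becomes home in game 3
record_result(conn, 2, 70, 68)   # visitor 9 wins and becomes visitor in game 3
next_round = conn.execute(
    "SELECT VisitorId, HomeId FROM Games WHERE GameId = 3"
).fetchone()
```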

Proposed Model
Proposed ER Diagram http://img257.imageshack.us/img257/1464/ncaaer.jpg
Team Table
All we need to know about a team is the name and seed. Therefore we need a "Team" table to store the seed value. The only candidate key is team name so we will use that as the primary to keep things simple. NCAA team names are unlikely to change over the course of a single tournament or contain duplicates so it should be an adequate key.
MatchUp Table
A "MatchUp" table can be used to pair the teams into each of the match ups. Foreign Keys (FK1, FK2) to the "Team" will ensure that the teams exist and a primary key over these values ensures that teams are only matched up against each other once.
A foreign key (FK4) to the "Team" table from the "MatchUp" table will record the winner. Logically the winner would need to be one of the two teams participating in the match up. A check constraint against the primary key could ensure this.
Once the outcome of a match up has been determined, the victor's seed could be retrieved from the team table and compared against the seeds of other victors in order to determine subsequent match ups. Upon doing so, an FK (FK3) to the resulting match up can be written to the determining match ups in order to depict the progress of the tournament (although this data could probably be derived at any time).
Games Table
I also modeled out the games of each Match Up. A game is identified by the match up it is a part of and a sequence number based on the order in which it took place during the match up. Games have a winner from the team table (FK2). Score could be recorded in this table as well.

4 tables:
Team(Team, Region, Seed)
User(UserId, Email, blablabla)
Bracket(BracketId, UserId, Points)
Pick(BracketId, GameId, Team, Points)
Each bracket a person submits will have 63 rows in the Pick table.
After each game is played you would update the Pick table to score individual picks. The Points field in this table will be null for a game not yet played, 0 for an incorrect pick, or a positive number for a correct pick. GameId is just a key identifying where in that user's bracket this pick goes (e.g. East_Round2_Game2, FinalFour_Game1).
The Points column in the Bracket table can be updated after each update of the Pick table so that it contains the sum of points for that bracket. The standings will be the most viewed thing, and you don't want to re-sum them every time someone views the leaderboard.
You don't need to keep a table with all the games that actually get played or their results, just update the pick table after each game. You can even do the bracket highlighting of correct/incorrect picks by just looking at the Points column in the pick table.
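The scoring step described above might look like this sketch in SQLite via Python. Only the Bracket and Pick tables are shown; the sample picks, the points value and the helper score_game are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bracket (
    BracketId INTEGER PRIMARY KEY,
    UserId    INTEGER NOT NULL,
    Points    INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE Pick (
    BracketId INTEGER NOT NULL REFERENCES Bracket(BracketId),
    GameId    TEXT NOT NULL,
    Team      TEXT NOT NULL,
    Points    INTEGER,               -- NULL until the game has been played
    PRIMARY KEY (BracketId, GameId)
);
INSERT INTO Bracket (BracketId, UserId) VALUES (1, 100), (2, 200);
INSERT INTO Pick (BracketId, GameId, Team) VALUES
    (1, 'East_Round2_Game2', 'UNC'),
    (2, 'East_Round2_Game2', 'Radford');
""")


def score_game(conn, game_id, winning_team, points_for_correct):
    """Hypothetical helper: score every pick for one game, then refresh the totals."""
    with conn:
        conn.execute(
            "UPDATE Pick SET Points = CASE WHEN Team = ? THEN ? ELSE 0 END "
            "WHERE GameId = ?",
            (winning_team, points_for_correct, game_id),
        )
        # Cache the per-bracket sum so the leaderboard needs no aggregation.
        conn.execute(
            """
            UPDATE Bracket SET Points = (
                SELECT COALESCE(SUM(Pick.Points), 0)
                FROM Pick
                WHERE Pick.BracketId = Bracket.BracketId
            )
            """
        )


score_game(conn, "East_Round2_Game2", "UNC", 2)
standings = conn.execute(
    "SELECT BracketId, Points FROM Bracket ORDER BY Points DESC, BracketId"
).fetchall()
```

Picks for games that have not been played keep Points = NULL, which SUM ignores, so the cached total only reflects finished games.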

For keeping track of a large number of different bracket predictions, you could use 67 bits to record the outcome of each game (i.e. each of the sixty-seven games played in the tournament is represented by one bit: 1 = "team A wins", 0 = "team B wins"). To display any given bracket, you can use a pretty simple function to map the 67 bits to the UI. The function knows the team names and their initial locations, and it tracks their movement through the bracket as it traces the 'bitboard'.
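A small sketch of such a mapping function in Python. It assumes a field whose size is a power of two and games numbered round by round from the top of the bracket, so play-in games would need extra handling; the function name and the bit order are assumptions.

```python
def decode_bracket(teams, bits):
    """Replay a bracket from an integer 'bitboard'.

    teams: team names in initial bracket order (length must be a power of two).
    bits:  bit i holds the outcome of game i; 1 = the first-listed team
           ("team A") wins, 0 = the second-listed team ("team B") wins.
    Returns one list of winners per round.
    """
    rounds = []
    current = list(teams)
    game = 0
    while len(current) > 1:
        winners = []
        for i in range(0, len(current), 2):
            team_a, team_b = current[i], current[i + 1]
            winners.append(team_a if (bits >> game) & 1 else team_b)
            game += 1
        rounds.append(winners)
        current = winners
    return rounds


# Four teams, three games: A beats B, D beats C, then A beats D.
rounds = decode_bracket(["A", "B", "C", "D"], 0b101)
```

A whole prediction then fits into a single integer column, at the cost of needing this function (or its inverse) for every read or write.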

I use the same schema for all of my databases.
t
--------
1 guid PK
2 guid FK
3 bit
Then in my code:
select [2],[3] from t where [1] = #1
#1 is the id of the data I am fetching. Then if [2] is not null, I select again, setting #1 to [2].
This makes it really easy to model the situation you posted.
