"Player has spaceships" database - sql

I'm pretty new to modeling databases, this is for a browser game.
Basically a player can spend resources to build spaceships. There are, let's say, 3 types of spaceships.
As I understand it, it's a 1-N relationship, but I'm really confused about how to store the quantity of each type of spaceship for a specific player.
Right now I have a Player table and a Spaceship table. The Spaceship table contains 3 rows that represent the specific types of spaceships, each with its own name, defense, etc. Is that OK?
I know that Spaceship will store Player's id as a foreign key, but I wonder if I just have to use COUNT function to display the quantity for each spaceship, or use an intermediate association like "Player-has-Spaceship" table with quantity attribute. The latter makes more sense to me.
Didn't try to code it blindly, I want a clear concept first.

CREATE TABLE counts (
player_id ...,
spaceship_id ...,
cnt INT UNSIGNED NOT NULL,
PRIMARY KEY(player_id, spaceship_id)
) ENGINE=InnoDB;
UPDATE counts SET
cnt = cnt + 1
WHERE player_id = ?
AND spaceship_id = ?;
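The counts-table idea above can be tried out with a minimal sketch, run here against an in-memory SQLite database from Python rather than MySQL. The column types, the `add_ships` helper and the sample ids are assumptions for illustration only; on MySQL the "insert or increment" step would more commonly be a single `INSERT ... ON DUPLICATE KEY UPDATE` statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE counts (
        player_id    INTEGER NOT NULL,   -- assumed type
        spaceship_id INTEGER NOT NULL,   -- assumed type
        cnt          INTEGER NOT NULL CHECK (cnt >= 0),
        PRIMARY KEY (player_id, spaceship_id)
    )
""")


def add_ships(player_id, spaceship_id, n=1):
    """Hypothetical helper: bump the count, creating the row on the first build."""
    cur = conn.execute(
        "UPDATE counts SET cnt = cnt + ? WHERE player_id = ? AND spaceship_id = ?",
        (n, player_id, spaceship_id),
    )
    if cur.rowcount == 0:
        # No existing row for this (player, ship type) pair yet.
        conn.execute(
            "INSERT INTO counts (player_id, spaceship_id, cnt) VALUES (?, ?, ?)",
            (player_id, spaceship_id, n),
        )


add_ships(1, 2)
add_ships(1, 2)
add_ships(1, 3, n=5)

# One row per ship type the player owns, no COUNT() needed.
fleet = conn.execute(
    "SELECT spaceship_id, cnt FROM counts WHERE player_id = ? ORDER BY spaceship_id",
    (1,),
).fetchall()
print(fleet)
```

The composite primary key is what guarantees a single row per player and ship type, so the quantity is simply read from `cnt`.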

Related

SQL schema Site Leader Board

So I am trying to set up a site which has challenges and then want to convert that to leader boards for each challenge, and then an all time leaderboard.
So I have a challenges table that looks like this:
Challenge ID | Challenge Name | Challenge Date | Sport | Prize Pool
Then I need a way for each challenge to have its own leader board of, say, 50 people, linked by the challenge ID (which will equal the Leaderboard ID).
I have a leader board of 50 people for that challenge that will look something like this:
Challenge ID | User | Place | Prize Won
My question is 2 things:
How can I make a table auto create when a new challenge is added to the challenges table?
How can I get a site-wide leader board across every challenge, so that it shows the following:
Rank | User | Prize Money Won (total across every challenge placed)
with the rank ordered by how much money was won?
I know this is a lot of questions all wrapped in one, schema design and logic.
Any insights greatly appreciated
A better approach than one table per challenge is one table for all of them. That way you can compute grand totals and individual challenge rankings all with the same table. You'd also want to not record the place directly but compute it on the fly with the appropriate window function depending on how you want to handle ties (rank(), dense_rank(), and row_number() will have different results in those cases); that way you don't have to keep adjusting it as you add new records.
A table something like this (you didn't specify a SQL database, so I'm going to assume SQLite; adjust as needed):
CREATE TABLE challenge_scores(user_id INTEGER REFERENCES users(id),
challenge_id INTEGER REFERENCES challenges(id),
prize_amount NUMERIC,
PRIMARY KEY(user_id, challenge_id));
will let you do things like
SELECT *
FROM (SELECT user_id,
sum(prize_amount) AS total,
rank() OVER (ORDER BY sum(prize_amount) DESC) AS place
FROM challenge_scores
GROUP BY user_id)
WHERE place <= 50
ORDER BY place;
for the global leaderboard, or the similar:
SELECT *
FROM (SELECT user_id,
prize_amount,
rank() OVER (ORDER BY prize_amount DESC) AS place
FROM challenge_scores
WHERE challenge_id = :some_challenge_id)
WHERE place <= 50
ORDER BY place;
for a specific challenge's.
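To see how the three window functions mentioned above differ on ties, here is a small sketch that runs the global-leaderboard query against an in-memory SQLite database from Python. The sample rows are made up, the foreign-key references are left out because no users or challenges tables are created, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE challenge_scores (
        user_id      INTEGER,
        challenge_id INTEGER,
        prize_amount NUMERIC,
        PRIMARY KEY (user_id, challenge_id)
    )
""")
# Made-up data: users 1 and 2 end up tied on total winnings (150 each).
conn.executemany(
    "INSERT INTO challenge_scores VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 11, 50), (2, 10, 150), (3, 10, 20), (3, 11, 30)],
)

rows = conn.execute("""
    SELECT user_id,
           sum(prize_amount) AS total,
           rank()       OVER (ORDER BY sum(prize_amount) DESC) AS rnk,
           dense_rank() OVER (ORDER BY sum(prize_amount) DESC) AS dense_rnk,
           row_number() OVER (ORDER BY sum(prize_amount) DESC, user_id) AS row_num
    FROM challenge_scores
    GROUP BY user_id
    ORDER BY total DESC, user_id
""").fetchall()

for row in rows:
    print(row)
```

With this data the tied users share place 1 under both `rank()` and `dense_rank()`; the third user is then placed 3rd by `rank()` and 2nd by `dense_rank()`, while `row_number()` breaks the tie (here by `user_id`, an arbitrary choice).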

Writing SQL query to find ranking

I'm trying to determine for a given person how many people have a better score than they do, and group it by the different teams they belong to. So, in the tables below, I'm grabbing the list of team_id from the team_person table where the person_id matches the person I care about. That will get me all of the teams I belong to.
Then I need to know each person_id that is in any team I belong to so that I can find out what their maximum score is from the performances table.
Once I have that, I finally want to determine, for each team_id, how many people on that team have a better score than I do, where better is simply defined as having a larger value.
I've gotten way beyond my abilities with SQL at this point. What I have so far, which seems to get me the maximum score for all the people I care about, (basically everything but my final "by team" requirement) is this:
SELECT person_id, MAX(score) m
FROM performances
WHERE category_id = 7 AND person_id IN (
-- Find all the people on the teams I belong to
SELECT DISTINCT person_id
FROM team_person
WHERE team_id IN (
-- Find all the teams that I belong to
SELECT DISTINCT team_id
FROM team_person
WHERE person_id = 2
)
)
GROUP BY person_id
ORDER BY 2 DESC
My two relevant tables are defined like so, and I'm using psql 9.1.15
Table "public.team_person"
Column | Type | Modifiers
------------+--------------------------+-------------------------------------------------------------
ident | integer | not null default nextval('team_person_ident_seq'::regclass)
team_id | integer | not null
person_id | integer | not null
*chop extraneous columns*
Indexes:
"team_person_pkey" PRIMARY KEY, btree (ident)
"teamPersonUnique" UNIQUE CONSTRAINT, btree (team_id, person_id)
Foreign-key constraints:
"team_person_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(ident) ON DELETE CASCADE
"team_person_team_id_fkey" FOREIGN KEY (team_id) REFERENCES team(ident) ON DELETE CASCADE
Referenced by:
TABLE "roster" CONSTRAINT "roster_team_person_id_fkey" FOREIGN KEY (team_person_id) REFERENCES team_person(ident) ON DELETE SET NULL
Triggers:
update_team_person_modified BEFORE INSERT OR UPDATE ON team_person FOR EACH ROW EXECUTE PROCEDURE update_modified_column()
Table "public.performances"
Column | Type | Modifiers
-------------+--------------------------+--------------------------------------------------------------
ident | bigint | not null default nextval('performances_ident_seq'::regclass)
category_id | integer | not null
person_id | integer | not null
score | real | not null
*chop extraneous columns*
Indexes:
"performances_pkey" PRIMARY KEY, btree (ident)
Foreign-key constraints:
"performances_category_id_fkey" FOREIGN KEY (category_id) REFERENCES performance_categories(ident) ON DELETE CASCADE
"performances_person_id_fkey" FOREIGN KEY (person_id) REFERENCES person(ident) ON DELETE CASCADE
First, state just the problem, without assumptions about how to get to the solution. You've done that fairly well:
determine for a given person how many people have a better score than they do, and group it by the different teams they belong to.
but I'd rephrase a bit:
For each team a given person is a member of, how many people in that team have a better score than the subject person?
I don't know about you, but it suddenly seems simpler now. Take the team table, left outer join team_person and filter for teams we're a member of, left outer join performances to find games we played with that team, left outer join team_person again to get other people who're members of each team, left outer join performances, filter out teams the subject person isn't a member of, group and aggregate.
It's underspecified for some corner cases (like a team where you're the only member, or a team where you didn't play a game), but eh, whatever.
Problems:
The team table isn't shown. Since you don't care about anything in the team table, you can omit it from the join and just use team_person as the join root.
Your team_person table is questionable, by the way. It does have a UNIQUE constraint on (team_id, person_id), which is the important part, but arguably that pair should be the primary key. It doesn't actually matter for this query because duplicate team memberships wouldn't change the result, but it matters for the data model: you can't be a member of a team more than once.
performances should also have a column identifying the particular game or whatever. Since you haven't shown one, I'm going to assume you mean that you're looking for people who, in any game, performed better than the subject person at least once, in that game or another game. If you actually want to find people who did better in a particular game then you need a suitable key on performances.
Fatal problem: performances is also missing a column linking the performance to the team. This makes it impossible to properly solve the problem because you can't get performances by a given person on a given team. I'm going to assume there is in fact a team_id on performances and you just left it out.
So, allowing for the above issues, I'd first acquire the data with a big join, then group and aggregate it. This join will give us, for each team we played in, for each of our performances, for each other player, for each of their other performances, one row with all the relevant information. You can then compare performances and aggregate.
The below is totally untested, since you didn't provide sample data and you chopped important parts out of your schema (or the schema is defective), but I'd try something like:
SELECT
my_performances.team_id,
-- Find how many distinct people scored better than us at least once,
-- no matter how many times or in which game.
COUNT(distinct other_team_person.person_id)
-- Start the join with our team memberships and how we scored in each.
-- If we didn't play any games for this team don't produce a result row
-- for it, so INNER JOIN.
FROM team_person my_team_person
INNER JOIN performances my_performances ON
(my_performances.person_id = my_team_person.person_id
AND my_performances.team_id = my_team_person.team_id)
-- Other members of teams we're also a member of, skipping
-- ourselves. An `INNER JOIN` is fine here because we know
-- a team with only ourselves as a member isn't interesting
-- and we might as well skip it.
INNER JOIN team_person other_team_person ON (
my_team_person.team_id = other_team_person.team_id
AND my_team_person.person_id <> other_team_person.person_id)
-- How each of those people performed in each team they're in
-- (because of previous filter, only considers teams we're in too).
-- INNER JOIN because if they never played they can't beat us.
INNER JOIN performances other_performances ON (
other_team_person.person_id = other_performances.person_id
AND other_team_person.team_id = other_performances.team_id)
-- Make sure `my_team_person` is only teams we're a member of
WHERE my_team_person.person_id = $1
-- Also discard rows where the other person didn't do better than us
AND my_performances.score < other_performances.score
-- Emit one row per team we're a member of
GROUP BY my_performances.team_id;
If you want to show teams where you never played and teams where you're the only player, you'll need to change some INNER JOINs to LEFT OUTER JOINs.
If you want to compare to find people who beat you only within a given game, you're going to need an extra column on performances, then an extra term in the join on other_performances to restrict it to only matching in the same game as my_performances.
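As a sanity check of the join described above, here is a cut-down sketch run against an in-memory SQLite database from Python instead of PostgreSQL. It assumes, as the answer does, that performances has a team_id column; the table definitions are trimmed to the relevant columns, and the sample data and the `better_than` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team_person (
        team_id   INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        UNIQUE (team_id, person_id)
    );
    CREATE TABLE performances (
        person_id INTEGER NOT NULL,
        team_id   INTEGER NOT NULL,  -- assumed column, not in the posted schema
        score     REAL    NOT NULL
    );
    -- (team_id, person_id): person 2 is on teams 1 and 2.
    INSERT INTO team_person VALUES (1, 2), (1, 3), (1, 4), (1, 6), (2, 2), (2, 5);
    -- (person_id, team_id, score)
    INSERT INTO performances VALUES
        (2, 1, 10), (3, 1, 15), (3, 1, 5), (4, 1, 8), (6, 1, 20),
        (2, 2, 7),  (5, 2, 9),  (5, 2, 12);
""")


def better_than(person_id):
    """Per team, count distinct teammates with at least one higher score."""
    return conn.execute("""
        SELECT mine.team_id, COUNT(DISTINCT other_tp.person_id) AS better
        FROM team_person my_tp
        JOIN performances mine
          ON mine.person_id = my_tp.person_id AND mine.team_id = my_tp.team_id
        JOIN team_person other_tp
          ON other_tp.team_id = my_tp.team_id
         AND other_tp.person_id <> my_tp.person_id
        JOIN performances theirs
          ON theirs.person_id = other_tp.person_id
         AND theirs.team_id = other_tp.team_id
        WHERE my_tp.person_id = ?
          AND mine.score < theirs.score
        GROUP BY mine.team_id
        ORDER BY mine.team_id
    """, (person_id,)).fetchall()


result = better_than(2)
print(result)
```

Because every join is an inner join, a team where nobody beat the subject person produces no row at all rather than a row with a count of zero.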

Schema Normalization :: Composite Game Schedule Constrained by Team

Related to the original generalized version of the problem: http://stackoverflow.com/questions/6068635/database-design-normalization-in-2-participant-event-join-table-or-2-column
As you'll see in the above thread, a game (event) is defined as exactly 2 teams (participants) playing each other on a given date (no teams play each other more than once in a day).
In our case we decided to go with a single composite schedule table with gameID PK, 2 columns for the teams (call them team1 & team2) and game date, time & location columns. Additionally, since two teams + date must be unique, we define a unique key on these combined fields. Separately we have a teams table with teamID PK related to schedule table columns team1 & team2 via FK.
This model works fine for us, but what I did not post in above thread is the relationship between scheduled games and results, as well as handling each team's "version" of the scheduled game (i.e. any notes team1 or team2 want to include, like, "this is a scrimmage against a non-divisional opponent and will not count in the league standings").
Our current table model is:
Teams > Composite Schedule > Results > Stats (tables for scoring & defense)
Teams > Players
Teams > Team Schedule*
*hack to handle notes issue and allow for TBD/TBA games where opponent, date, and/or location may not be known at time of schedule submission.
I can't help but think we can consolidate this model. For example, is there really a need for a separate results table? Couldn't the composite schedule be BOTH the schedule and the game result? This is where a join table could come into play.
Join table would effectively be a gameID generator consisting of:
gameID (PK)
gameDate
gameTime
location
Then revised composite schedule/results would be:
id (PK)
teamID (FK to teams table)
gameID (FK to join table)
gameType (scrimmage, tournament, playoff)
score (i.e. number of goals)
penalties
powerplays
outcome (win-loss-tie)
notes (team's version of the game)
Thoughts appreciated, has been tricky trying to drilldown to the central issue (thus original question above)
I don't see any reason to have separate tables for the schedule and results. However, I would move "gameType" to the Games table, otherwise you're storing the same value twice. I'd also consider adding the teamIDs to the Games table. This will serve two purposes: it will allow you to easily distinguish between home and away teams and it will make writing a query that returns both teams' data on the same row significantly easier.
Games
gameID (PK)
gameDate
gameTime
homeTeamID
awayTeamID
location
gameType (scrimmage, tournament, playoff)
Sides
id (PK)
TeamID (FK to teams table)
gameID (FK to games table)
score
penalties
powerplays
notes
As shown, I would also leave out the "Outcome" field. That can be effectively and efficiently derived from the "Score" columns.
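A rough sketch of how the outcome could be derived from the two score values, using an in-memory SQLite database from Python. Only the Sides columns needed for the comparison are created, and the team ids and scores are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sides (
        id     INTEGER PRIMARY KEY,
        teamID INTEGER NOT NULL,
        gameID INTEGER NOT NULL,
        score  INTEGER NOT NULL,
        UNIQUE (gameID, teamID)
    );
    -- Game 1: team 10 beats team 20. Game 2: teams 10 and 30 tie.
    INSERT INTO sides (teamID, gameID, score) VALUES
        (10, 1, 3), (20, 1, 1), (10, 2, 2), (30, 2, 2);
""")

# Pair each side with the opposing side of the same game and compare scores.
outcomes = conn.execute("""
    SELECT s.gameID, s.teamID,
           CASE WHEN s.score > o.score THEN 'win'
                WHEN s.score < o.score THEN 'loss'
                ELSE 'tie'
           END AS outcome
    FROM sides s
    JOIN sides o ON o.gameID = s.gameID AND o.teamID <> s.teamID
    ORDER BY s.gameID, s.teamID
""").fetchall()

for row in outcomes:
    print(row)
```

The self-join relies on each game having exactly two Sides rows, which matches the "exactly 2 teams per game" rule stated in the question.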

Tips on Database schema

I have a database that tracks UK Horse races.
Race contains all the information for a particular race.
CREATE TABLE "race" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"date" TEXT NOT NULL,
"time" TEXT NOT NULL,
"name" TEXT NOT NULL,
"class" INTEGER NOT NULL,
"distance" INTEGER NOT NULL,
"extra" TEXT NOT NULL,
"going" TEXT NOT NULL,
"handicap" INTEGER NOT NULL,
"prize" REAL,
"purse" REAL,
"surface" TEXT NOT NULL,
"type" TEXT NOT NULL,
"course_id" INTEGER NOT NULL,
"betfair_path" TEXT NOT NULL UNIQUE,
"racingpost_id" INTEGER NOT NULL UNIQUE,
UNIQUE("betfair_path", "racingpost_id")
);
A race can have many entries.
CREATE TABLE "entry" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"weight" INTEGER,
"allowance" INTEGER,
"horse_id" INTEGER NOT NULL,
"jockey_id" INTEGER,
"trainer_id" INTEGER,
"race_id" INTEGER NOT NULL,
UNIQUE("race_id", "horse_id")
);
An entry can have 0 or 1 runner. This takes into account non-runners, horses entered for a race but who failed to start.
CREATE TABLE "runner" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"position" TEXT NOT NULL,
"beaten" INTEGER,
"isp" REAL NOT NULL,
"bsp" REAL,
"place" REAL,
"over_weight" INTEGER,
"entry_id" INTEGER NOT NULL UNIQUE
);
My questions are:
Is that actually the best way to store my Entry vs Runner data? Note: Entry data is always harvested in a single sweep, and runner (basically result) is found later.
What query would I need to quickly find total entries vs. total runners for a particular race?
How can I easily match the runner information with entry information without multiple selects?
Apologies if I am missing something obvious but I am now brain dead from coding this application.
Your schema looks reasonable. The key construct to use to address your SQL questions is LEFT JOIN, for example:
SELECT COUNT(entry.id) entry_count, COUNT(runner.id) runner_count
FROM entry
LEFT JOIN runner ON runner.entry_id = entry.id
WHERE race_id = 1
From Wikipedia:
... a left outer join returns all the values from the left table, plus matched values from the right table (or NULL in case of no matching join predicate).
So in general for your schema, focus on the entry table and LEFT JOIN the runner table as needed.
You used the relational-database tag, and per the title you want advice on your schema. Even though the single question is answered, you may have more tomorrow.
I couldn't make any sense of your three flat files, so I drew them up into what they might look like in a Relational database, where the information is organised and queries are easy. Going brain dead is not unusual when the information remains in its complex form.
If you have not seen the Relational Modelling Standard, you might need the IDEF1X Notation.
Note, OwnerId, JockeyId, and TrainerId are all PersonIds. No use manufacturing new ones when there is a perfectly good unique one already sitting there in the table. Just rename it to reflect its Role, and the PK of the table that it is in (the relevance of this will become clear when you code).
Multiple SELECTs are nothing to be scared of; SQL is a cumbersome language but that is all we have. The problem is:
the complexity (necessary due to a bad model) of each SELECT
and whether you learn and understand how to use subqueries or not.
Single level queries are obviously very limited, and will lead to procedural (row-by-row) processing instead of set-processing.
Single level queries result in huge result sets that then have to be beaten into submission using GROUP BY, etc. Not good for performance, churning through all that unwanted data; better to get only the data you really want.
Now the queries.
When you are printing race forms, I think you will need the Position scheduled and advertised for the RaceEntry; it is not an element of a Runner.
Now that we have gotten rid of those Ids all over the place, which force all sorts of unnecessary joins, we can join directly to the parents concerned (fewer joins). E.g. for the Race Form, which is only concerned with RaceEntry, for the Owner, you can join directly to Person using WHERE OwnerId = Person.PersonId; no need to join HorseRegistered or Owner.
LEFT and RIGHT joins are OUTER joins, which means the rows on one side may be missing. That method has been answered, and you will get Nulls, which you have to process later (more code and cycles). I do not think that is what you want, if you are filling forms or a web page.
The concept here is to think in terms of Relational sets, not row-by-row processing. But you need a database for that. Now that we have a bit of Relational power in the beast, you can try this for the Race Result (not the Race Form), instead of procedural processing. These are Scalar Subqueries. For the passed Race Identifiers (the outer query is only concerned with a Race):
SELECT (SELECT ISNULL(Place, " ")
FROM Runner
WHERE RacecourseCode = RE.RacecourseCode
AND RaceDate = RE.RaceDate
AND RaceNo = RE.RaceNo
AND HorseId = RE.HorseId) AS Finish,
(SELECT ISNULL(Name, "SCRATCH")
FROM Runner R,
Horse H
WHERE R.RacecourseCode = RE.RacecourseCode
AND R.RaceDate = RE.RaceDate
AND R.RaceNo = RE.RaceNo
AND R.HorseId = RE.HorseId
AND H.HorseId = RE.HorseId) AS Horse,
-- Details,
(SELECT Name FROM Person WHERE PersonId = RE.TrainerId) AS Trainer,
(SELECT Name FROM Person WHERE PersonId = RE.JockeyId) AS Jockey,
ISP AS SP,
Weight AS Wt
FROM RaceEntry RE
WHERE RaceDate = #RaceDate
AND RacecourseCode = #RacecourseCode -- to print entire race form,
AND RaceNo = #RaceNo -- remove these 2 lines
ORDER BY Position
This matches entries and runners for a given race
SELECT E.*, R.*
FROM entry E LEFT JOIN runner R on R.entry_id = E.id
WHERE E.race_id = X
If the entry has no runner, then the R.* fields are all null. You can count such null fields to answer your first query (or, perhaps more easily, subtract the runner count from the entry count).
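The LEFT JOIN approach can be checked with a small sketch against an in-memory SQLite database from Python. The tables are cut down to the columns the join needs, and the horse ids, race ids and positions are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (
        id       INTEGER PRIMARY KEY,
        horse_id INTEGER NOT NULL,
        race_id  INTEGER NOT NULL,
        UNIQUE (race_id, horse_id)
    );
    CREATE TABLE runner (
        id       INTEGER PRIMARY KEY,
        position TEXT NOT NULL,
        entry_id INTEGER NOT NULL UNIQUE
    );
    -- Race 1 has three entries; the horse in entry 3 never started.
    INSERT INTO entry (id, horse_id, race_id) VALUES
        (1, 101, 1), (2, 102, 1), (3, 103, 1), (4, 101, 2);
    INSERT INTO runner (id, position, entry_id) VALUES
        (1, '1', 1), (2, '2', 2), (3, '1', 4);
""")

# COUNT(column) skips NULLs, so COUNT(r.id) only counts entries that ran.
counts = conn.execute("""
    SELECT COUNT(e.id) AS entries,
           COUNT(r.id) AS runners,
           COUNT(e.id) - COUNT(r.id) AS non_runners
    FROM entry e
    LEFT JOIN runner r ON r.entry_id = e.id
    WHERE e.race_id = ?
""", (1,)).fetchone()

# Entries whose runner side is NULL are the non-runners.
non_runners = [row[0] for row in conn.execute("""
    SELECT e.horse_id
    FROM entry e
    LEFT JOIN runner r ON r.entry_id = e.id
    WHERE e.race_id = ? AND r.id IS NULL
""", (1,))]

print(counts, non_runners)
```

Both queries touch each entry once, so entry and runner data are matched in a single SELECT rather than one query per table.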

How to make this sub-sub-query work?

I am trying to do this in one query. I asked a similar question a few days ago but my personal requirements have changed.
I have a game type website where users can attend "classes".
I am using MySQL. I have four tables:
hl_classes (int id, int professor, varchar class, text description)
hl_classes_lessons (int id, int class_id, varchar lessonTitle, varchar lexiconLink, text lessonData)
hl_classes_answers (int id, int lesson_id, int student, text submit_answer, int percent)
hl_classes stores all of the classes on the website.
The lessons are the individual lessons for each class. A class can have infinite lessons. Each lesson is available in a specific term.
hl_classes_terms stores a list of all the terms and the current term has the field active = '1'.
When a user submits their answers to a lesson it is stored in hl_classes_answers. A user can only answer each lesson once. Lessons have to be answered sequentially. All users attend all "classes".
What I am trying to do is grab the next lesson for each user to do in each class. When the users start they are in term 1. When they complete all 10 lessons in each class they move on to term 2. When they finish lesson 20 for each class they move on to term 3. Let's say we know the term the user is in by the PHP variable $term.
So this is the query I am currently trying to massage into shape, but it doesn't work, specifically because hC.id is unknown in the WHERE clause:
SELECT hC.id, hC.class, (SELECT MIN(output.id) as nextLessonID
FROM ( SELECT id, class_id
FROM hl_classes_lessons hL
WHERE hL.class_id = hC.id
ORDER BY hL.id
LIMIT $term,10 ) as output
WHERE output.id NOT IN (SELECT lesson_id FROM hl_classes_answers WHERE student = $USER_ID)) as nextLessonID
FROM hl_classes hC
My logic behind this query is: for each class, select all of the lessons in the term the current user is in. From these, filter out the lessons the user has already done and grab the MINIMUM id of the lessons yet to be done. This will be the lesson the user has to do.
I hope I have made my question clear enough.
My assumption is that you want a query that will tell what the next lesson is for a particular student for a given term, or null if there are no further classes for that student in that term. The result should be one row or null.
In order to do that with any efficiency (and IMHO, sanity) you need to revisit your table structure and assumptions about your data first. I am assuming from the table structures that you provided and how you described the lesson numbers, that there would be, for example, class 1, lessons 1, 2, 3, ..., 10, 11, ... 20, 21, ..., 30, and then class 2, lessons 1...30, and then class 3, lessons 1...30, etc. Further, lessons 1-10 for each class correspond to term 1, 11-20, to term 2, and 21-30 to term 3. Finally, terms are completed in order--class 3 lesson 10 is completed before class 1 lesson 11.
First, rather than using your class number as both a unique identifier and an ordering number (class 1 happens before class 2, etc.), I would suggest a unique id field (probably an auto-increment), and a separate class_num field for the ordering number. (This is less critical for the classes table than it is for the lessons table, described next.)
Next, and similarly, lessons should get a unique id field separate from its lesson number field. The id would be the PK. This unique id is necessary to greatly simplify the query you want, as well as any other queries you might need. Without it you are dealing with a two-field composite key that makes many joins and subqueries nightmarishly complicated. You would probably want an additional unique index on class_id and lesson_num so that a lesson number is not re-used for a class. Also, this table should contain the term_num (or term_id) that a particular lesson for a particular class is assigned to. This will keep you from having to calculate what term a lesson is in using an overcomplicated MOD formula. That would be overkill. Just store the term number with the lesson information, and you can organize terms however you want.
Next, the answers table's id field should be a unique auto-increment. If it is important, you might also want a unique index on lesson_id and student_id (although this means either no retakes, or retake overwrites).
So I now have:
hl_classes (int id, int class_num, professor, class_name, description) PK: id, autoinc
hl_classes_lessons (int id, int class_id, int lesson_num, int term_num, l_title, l_link, l_data) PK: id, autoinc; Unique Key: class_id, lesson_num
hl_classes_answers (int id, int lesson_id, int student, ans, pct) PK: id, autoinc; Unique Key: lesson_id, student
With that, I came up with:
select hC.id as next_class_id, hL.id as next_lesson_id, hC.class_name as next_class, hL.term_num as term_num, hC.class_num as next_class_num, hL.lesson_num AS next_lesson_num
from hl_classes hC
left join hl_classes_lessons hL on hL.class_id = hC.id
where hL.term_num = $TERM_NUM
and hL.id not in (
select hA.lesson_id
from hl_classes_answers hA
where student = $USER_ID
)
order by hC.class_num, hL.lesson_num
limit 1;
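This will give you back either one row containing the relevant information about the next lesson for that student, given that term, or no rows at all if nothing is left (the WHERE filter on hL.term_num discards the NULL-extended rows, so the LEFT JOIN effectively acts as an inner join). Note that the ids are not for display, as they could be any ol' number. You would display the _num fields.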
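A minimal sketch of the revised schema and the "next lesson" query, run against an in-memory SQLite database from Python rather than MySQL. Only the columns the query touches are created; the class names, the sample rows and the `next_lesson` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hl_classes (
        id INTEGER PRIMARY KEY, class_num INTEGER, class_name TEXT
    );
    CREATE TABLE hl_classes_lessons (
        id INTEGER PRIMARY KEY, class_id INTEGER,
        lesson_num INTEGER, term_num INTEGER,
        UNIQUE (class_id, lesson_num)
    );
    CREATE TABLE hl_classes_answers (
        id INTEGER PRIMARY KEY, lesson_id INTEGER, student INTEGER,
        UNIQUE (lesson_id, student)
    );
    INSERT INTO hl_classes VALUES (1, 1, 'Class A'), (2, 2, 'Class B');
    -- (id, class_id, lesson_num, term_num)
    INSERT INTO hl_classes_lessons VALUES
        (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 11, 2), (4, 2, 1, 1), (5, 2, 2, 1);
    -- Student 7 has answered both term-1 lessons of Class A.
    INSERT INTO hl_classes_answers (lesson_id, student) VALUES (1, 7), (2, 7);
""")


def next_lesson(student, term):
    """Return (class_name, lesson_num) of the next unanswered lesson, or None."""
    return conn.execute("""
        SELECT hC.class_name, hL.lesson_num
        FROM hl_classes hC
        JOIN hl_classes_lessons hL ON hL.class_id = hC.id
        WHERE hL.term_num = ?
          AND hL.id NOT IN (
              SELECT hA.lesson_id FROM hl_classes_answers hA WHERE hA.student = ?
          )
        ORDER BY hC.class_num, hL.lesson_num
        LIMIT 1
    """, (term, student)).fetchone()


print(next_lesson(7, 1), next_lesson(8, 1), next_lesson(7, 3))
```

Binding the term and student as parameters, instead of interpolating `$TERM_NUM` and `$USER_ID` into the SQL string, also avoids SQL injection from the PHP side.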
I am not sure what you want the end result to look like, and why you have the LIMIT $term in your query, but if you want to get all the classes and the next lesson (if available) for the user you can use this:
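SELECT c.*, l.*
FROM hl_classes c JOIN (
SELECT l.class_id, MIN(l.id) NextLessonID
FROM hl_classes_lessons l LEFT JOIN (
-- Highest lesson id the student has answered in each class.
-- hl_classes_answers has no class_id, so look it up via the lesson.
SELECT al.class_id, MAX(sca.lesson_id) MaxID
FROM hl_classes_answers sca
JOIN hl_classes_lessons al ON al.id = sca.lesson_id
WHERE sca.student = $USER_ID
GROUP BY al.class_id
) cm ON l.class_id = cm.class_id
-- Keep lessons after the last answered one, or all lessons if none answered
WHERE cm.class_id IS NULL OR l.id > cm.MaxID
GROUP BY l.class_id
) nid ON c.id = nid.class_id
JOIN hl_classes_lessons l ON c.id = l.class_id AND l.id = nid.NextLessonID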
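Here is a sketch of the same "next lesson per class" idea, run against an in-memory SQLite database from Python. The posted hl_classes_answers table has no class_id column, so this version looks the class up through hl_classes_lessons; it also assumes lesson ids increase in the order lessons must be taken, as the question states. The sample rows and the `next_per_class` helper are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hl_classes (id INTEGER PRIMARY KEY, class TEXT);
    CREATE TABLE hl_classes_lessons (id INTEGER PRIMARY KEY, class_id INTEGER);
    CREATE TABLE hl_classes_answers (
        id INTEGER PRIMARY KEY, lesson_id INTEGER, student INTEGER
    );
    INSERT INTO hl_classes VALUES (1, 'Class A'), (2, 'Class B');
    INSERT INTO hl_classes_lessons VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
    -- Student 7 has answered lessons 1 and 2 (both in class 1).
    INSERT INTO hl_classes_answers (lesson_id, student) VALUES (1, 7), (2, 7);
""")


def next_per_class(student):
    """Return [(class_id, next_lesson_id)] for classes with lessons left."""
    return conn.execute("""
        SELECT c.id, MIN(l.id) AS next_lesson_id
        FROM hl_classes c
        JOIN hl_classes_lessons l ON l.class_id = c.id
        LEFT JOIN (
            -- Highest answered lesson id per class for this student.
            SELECT al.class_id, MAX(a.lesson_id) AS max_id
            FROM hl_classes_answers a
            JOIN hl_classes_lessons al ON al.id = a.lesson_id
            WHERE a.student = ?
            GROUP BY al.class_id
        ) done ON done.class_id = c.id
        WHERE done.class_id IS NULL OR l.id > done.max_id
        GROUP BY c.id
        ORDER BY c.id
    """, (student,)).fetchall()


print(next_per_class(7), next_per_class(9))
```

A class in which the student has answered every lesson simply drops out of the result, since no lesson id is greater than the last answered one.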