Completely noob to SQL.
I have created the following table, which stores data on matches between two opponents and the points the winner got.
CREATE TABLE matches ( winner INT references players,
loser INT references players,
gamepoints INT);
I created the below VIEW to show standings:
CREATE VIEW standings as
select
players.id,
players.name,
count(matches.winner) as number_of_wins,
coalesce(sum(matches.gamepoints),0) as points
from players left join matches
on players.id = matches.winner
group by players.name, players.id
order by number_of_wins desc, points desc;
I wish to add a column that will show how many games a player played. My problem is that games appear in both matches.winner and matches.loser columns, and I'm not sure how to aggregate them in the standings view.
Also, would you say that the matches table is normalized?
Any help would be greatly appreciated.
EDIT: changed matches content.
With the help of #Jorge Campos, this is the solution:
CREATE VIEW games_won as
select p.id, p.name, coalesce(sum(m.gamepoints),0) gp, count(m.winner) ng
from players p left join matches m
on p.id=m.winner
group by p.id, p.name;
CREATE VIEW games_lost as
select p.id, p.name, count(m.loser) as ng
from players p left join matches m
on p.id=m.loser
group by p.id, p.name;
CREATE VIEW standings as
select w.id, w.name, w.ng as wins, w.ng+l.ng as matches, w.gp as gamepoints
from games_won w INNER JOIN games_lost l
on w.id=l.id
order by wins desc, gamepoints;
For the simple case you show there are only a few things that you should fix to be ok. Again for the problem you show.
First: Change the columns types of the table matches it shouldn't be SERIAL as it is an autoincrement type column (not a real type). Both columns are foreign keys and it should be integer, int or bigint
as
create table matches (
winner bigint,
loser bigint,
gamepoints int,
constraint fk_player_winner foreign key (winner)
references players(id),
constraint fk_player_loser foreign key (loser)
references players(id)
);
Second: to know how many games a player did with the number of points you can create two subqueries one with the winners and one with the losers and join the two summing the values. The catch is that you have to decrease the gamepoints from the two:
select w.id, w.name, w.gp-l.gp as gamepoints, w.ng+l.ng
from (select p.id, p.name, sum(m.gamepoints) gp, count(m.winner) as ng
from players p inner join matches m
on p.id=m.winner
group by p.id, p.name ) w
INNER JOIN
(select p.id, p.name, sum(m.gamepoints) gp, count(m.loser) as ng
from players p inner join matches m
on p.id=m.loser
group by p.id, p.name) l on w.id=l.id;
From it you create your view.
Note: maybe I'm being overkill with this two subqueries. It is possible to work out with a join between two players tables and a matches
See how it goes here on fiddle: http://sqlfiddle.com/#!15/5b6a4/4
Related
Lets say i've got a db with 3 tables:
Players (PK id_player, name...),
Tournaments (PK id_tournament, name...),
Game (PK id_turn, FK id_tournament, FK id_player and score)
Players participate in tournaments. Table called Game keeps track of each player's score for different tournaments)
I want to create a view that looks like this:
torunament_name Winner highest_score
Tournament_1 Jones 300
Tournament_2 White 250
I tried different aproaches but I'm fairly new to sql (and alsoto this forum)
I tried using union all clause like:
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '1' group by "Id_player" order by
"Score" desc) where rownum <= 1
union all
select * from (select "Id_player", avg("score") as "Score" from
"Game" where "Id_tournament" = '2' group by "Id_player" order by
"Score" desc) where rownum <= 1;
and ofc it works but whenever a tournament happens, i would have to manually add a select statement to this with Id_torunament = nextvalue
EDIT:
So lets say that player with id 1 scored 50 points in tournament a, player 2 scored 40 points, player 1 wins, so the table should show only player 1 as the winner (or if its possible 2or more players if its a tie) of this tournament. Next row shows the winner of second tournament. I dont think Im going to put multiple games for one player in the same tournament, but if i would, it would probably count avg from all his scores.
EDIT2:
Create table scripts:
create table players
(id_player numeric(5) constraint pk_id_player primary key, name
varchar2(50));
create table tournaments
(id_tournament numeric(5) constraint pk_id_tournament primary key,
name varchar2(50));
create table game
(id_game numeric(5) constraint pk_game primary key, id_player
numeric(5) constraint fk_id_player references players(id_player),
id_tournament numeric(5) constraint fk_id_tournament references
tournaments(id_tournament), score numeric(3));
RDBM screenshot
FINAL EDIT:
Ok, in case anyone is wondering I used Jorge Campos script, changed it a bit and it works. Thank you all for helping. Unfortunately I cannot upvote comments yet, so I can only thank by posting. Heres the final script:
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select g.id_tournament, g.id_player,
row_number() over (partition by t.name order by
score desc) as rd from game g join tournaments t on
g.id_tournament = t.id_tournament
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc;
This query could be simplified depending on the RDBMs you are using.
select
t.name,
p.name as winner,
g.score
from
game g inner join tournaments t
on g.id_tournament = t.id_tournament
inner join players p
on g.id_player = p.id_player
inner join
(select id_tournament,
id_player,
row_number() over (partition by t.name order by score desc) as rd
from game
) a
on g.id_player = a.id_player
and g.id_tournament = a.id_tournament
and a.rd=1
order by t.name, g.score desc
Assuming what you want as "Display high score of each player in each tournament"
your query would be like below in MS Sql server
select
t.name as tournament_name,
p.name as Winner,
Max(g.score) as [Highest_Score]
from Tournmanents t
Inner join Game g on t.id_tournament=g.id_tournament
inner join Players p on p.id_player=g.id_player
group by
g.id_tournament,
g.id_player,
t.name,
p.name
Please check this if this works for you
SELECT tournemntData.id_tournament ,
tournemntData.name ,
dbo.Players.name ,
tournemntData.Score
FROM dbo.Game
INNER JOIN ( SELECT dbo.Tournaments.id_tournament ,
dbo.Tournaments.name ,
MAX(dbo.Game.score) AS Score
FROM dbo.Game
INNER JOIN dbo.Tournaments ONTournaments.id_tournament = Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
GROUP BY dbo.Tournaments.id_tournament ,
dbo.Tournaments.name
) tournemntData ON tournemntData.id_tournament =Game.id_tournament
INNER JOIN dbo.Players ON Players.id_player = Game.id_player
WHERE tournemntData.Score = dbo.Game.score
I have this database schema:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name char(50) NOT NULL UNIQUE
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name char(50) NOT NULL,
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
uid INTEGER REFERENCES users (id) NOT NULL,
pid INTEGER REFERENCES products (id) NOT NULL,
quantity INTEGER NOT NULL,
price FLOAT NOT NULL CHECK (price >= 0)
);
I am trying to write a query that will give me all combinations of users and products, as well as the total amount spent by the user on that product. Specifically, if I have 5 products and 5 users, there should be 25 rows in the table. Right now I have a query that almost gets the job done, however, if the user has never purchased that product then there is no row printed at all.
Here's what I've written so far:
SELECT u.name as username, p.name as productname, SUM(o.quantity * o.price) as totalPrice
FROM users u, orders o, products p
WHERE u.id = o.uid
AND p.id = o.pid
GROUP BY u.name, p.name
ORDER BY u.name, p.name
I figure that this requires some sort of join, but my SQL knowledge is limited and I am not sure what would be the best way to go about doing this. I think if somebody can help me figure this out then I will have a much better understanding.
You can do this using cross join and left join:
select u.name as username, p.name as productname,
sum(o.quantity * o.price) as totalPrice
from users u cross join
products p left join
orders o
on o.uid = u.id and o.pid = p.id
group by u.name, p.name;
The cross join generates all the rows. The left join brings in the matching rows. A simple rule when using SQL is: Never use commas in the FROM clause. Always use explicit JOIN syntax.
My school task was to get names from my movie database actors which play in movies with highest ratings
I made it this way and it works :
select name,surname
from actor
where ACTORID in(
select actorid
from actor_movie
where MOVIEID in (
select movieid
from movie
where RATINGID in (
select ratingid
from rating
where PERCENT_CSFD = (
select max(percent_csfd)
from rating
)
)
)
);
the output is :
Gary Oldman
Sigourney Weaver
...but I'd like to also add to this select mentioned movie and its rating. It accessible in inner selects but I don't know how to join it with outer select in which i can work just with rows found in Actor Table.
Thank you for your answers.
You just need to join the tables properly. Afterwards you can simply add the columns you´d like to select. The final select could be looking like this.
select ac.name, ac.surname, -- go on selecting from the different tables
from actor ac
inner join actor_movie amo
on amo.actorid = ac.actorid
inner join movie mo
on amo.movieid = mo.movieid
inner join rating ra
on ra.ratingid = mo.ratingid
where ra.PERCENT_CSFD =
(select max(percent_csfd)
from rating)
A way to get your result with a slightly different method could be something like:
select *
from
(
select name, surname, percent_csfd, row_number() over ( order by percent_csfd desc) as rank
from actor
inner join actor_movie
using (actorId)
inner join movie
using (movieId)
inner join rating
using(ratingId)
(
where rank = 1
This uses row_number to evaluate the "rank" of the movie(s) and then filter for the movie(s) with the highest rating.
I have two entities in my database that are connected with a many to many relationship. I was wondering what would be the best way to list which entities have the most similarities based on it?
I tried doing a count(*) with intersect, but the query takes too long to run on every entry in my database (there are about 20k records). When running the query I wrote, CPU usage jumps to 100% and the database has locking issues.
Here is some code showing what I've tried:
My tables look something along these lines:
/* 20k records */
create table Movie(
Id INT PRIMARY KEY,
Title varchar(255)
);
/* 200-300 records */
create table Tags(
Id INT PRIMARY KEY,
Desc varchar(255)
);
/* 200,000-300,000 records */
create table TagMovies(
Movie_Id INT,
Tag_Id INT,
PRIMARY KEY (Movie_Id, Tag_Id),
FOREIGN KEY (Movie_Id) REFERENCES Movie(Id),
FOREIGN KEY (Tag_Id) REFERENCES Tags(Id),
);
(This works, but it is terribly slow)
This is the query that I wrote to try and list them:
Usually I also filter with top 1 & add a where clause to get a specific set of related data.
SELECT
bk.Id,
rh.Id
FROM
Movies bk
CROSS APPLY (
SELECT TOP 15
b.Id,
/* Tags Score */
(
SELECT COUNT(*) FROM (
SELECT x.Tag_Id FROM TagMovies x WHERE x.Movie_Id = bk.Id
INTERSECT
SELECT x.Tag_Id FROM TagMovies x WHERE x.Movie_Id = b.Id
) Q1
)
as Amount
FROM
Movies b
WHERE
b.Id <> bk.Id
ORDER BY Amount DESC
) rh
Explanation:
Movies have tags and the user can get try to find movies similar to the one that they selected based on other movies that have similar tags.
Hmm ... just an idea, but maybe I didnt understand ...
This query should return best matched movies by tags for a given movie ID:
SELECT m.id, m.title, GROUP_CONCAT(DISTINCT t.Descr SEPARATOR ', ') as tags, count(*) as matches
FROM stack.Movie m
LEFT JOIN stack.TagMovies tm ON m.Id = tm.Movie_Id
LEFT JOIN stack.Tags t ON tm.Tag_Id = t.Id
WHERE m.id != 1
AND tm.Tag_Id IN (SELECT Tag_Id FROM stack.TagMovies tm WHERE tm.Movie_Id = 1)
GROUP BY m.id
ORDER BY matches DESC
LIMIT 15;
EDIT:
I just realized that it's for M$ SQL ... but maybe something similar can be done...
You should probably decide on a naming convention and stick with it. Are tables singular or plural nouns? I don't want to get into that debate, but pick one or the other.
Without access to your database I don't know how this will perform. It's just off the top of my head. You could also limit this by the M.id value to find the best matches for a single movie, which I think would improve performance by quite a bit.
Also, TOP x should let you get the x closest matches.
SELECT
M.id,
M.title,
SM.id AS similar_movie_id,
SM.title AS similar_movie_title,
COUNT(*) AS matched_tags
FROM
Movie M
INNER JOIN TagsMovie TM1 ON TM1.movie_id = M.movie_id
INNER JOIN TagsMovie TM2 ON
TM2.tag_id = TM1.tag_id AND
TM2.movie_id <> TM1.movie_id
INNER JOIN Movie SM ON SM.movie_id = TM2.movie_id
GROUP BY
M.id,
M.title,
SM.id AS similar_movie_id,
SM.title AS similar_movie_title
ORDER BY
COUNT(*) DESC
I have 3 table as follow :
s(s# int,sname nchar(10))
p(p# int,pname nchar(10))
sp(s# int,p# int)
table "s" is table of suppliers and "s#" is primary key of it.also table "p" is table of products and "p#" is primary key on it."s#" and "p#" are foreign key in table "sp".
now my question is "How can I select name of suppliers from table "s" which producing all of products in table "p"...
SELECT p.*, s.sname FROM s, sp, p WHERE s.s# = sp.s# AND sp.p# = p.p#;
This statement will output all products with all their suppliers.
Now we group my suppliers, and count how many products they provide:
SELECT s.sname, count(*) FROM s, sp, p WHERE s.s# = sp.s# AND sp.p# = p.p# GROUP BY s.s#;
Now we know exacly, how many products each supplier provides. And we also know, how many products are in the productstable:
SELECT count(*) FROM p;
If you compare these values, you get your desired result:
SELECT amounts.name FROM
( SELECT s.sname AS name, count(*) AS offers
FROM s, sp, p
WHERE s.s# = sp.s# AND sp.p# = p.p#
GROUP BY s.s# ) amounts, -- this is a temp. tablename
( SELECT count(*) AS avaiable FROM p ) countTbl
WHERE amounts.offers = countTbl.avaiable;
Notice, that I didn't test the query. But you should get an idea on how to solve this problem.
It might also be possible to write this query more efficient, but this one can be understood easily.
There are two ways to do this, the first I thought of was to invert the logic.
Rather than attempting to find every entry of P let's just look for any that don't exist, then exclude those entries from S:
SELECT *
FROM S
WHERE
NOT EXISTS (
SELECT *
FROM P
LEFT JOIN SP
ON P.P# = SP.P#
AND SP.S# = S.S#
WHERE
SP.P# IS NULL
)