HackerRank Winners Chart SQL advanced-level question

This is a new question HackerRank has added for the advanced-level SQL certification. I was not able to solve it at all. Can anyone help?
There were a number of contests in which each participant made a number of attempts. Only the attempt with the highest score is considered. Write a query to list the contestants ranked in the top 3 for each contest. If multiple contestants have the same score in a contest, they share the same rank.
Report event_id, rank 1 name(s), rank 2 name(s), rank 3 name(s). Names that share a rank should be ordered alphabetically and separated by a comma.
Order the report by event_id.

The following is some sample data for your scenario: a table of contestants and a table of the attempts they made. I put each person's attempts on their own line so the separate attempts per person are easy to see.
create table contestants
( id int identity(1,1) not null,
personName nvarchar(10) )
insert into contestants ( personName )
values ( 'Bill' ), ('Mary'), ('Jane' ), ('Mark')
create table attempts
( id int identity(1,1) not null,
contestantid int not null,
score int not null )
insert into attempts ( contestantid, score )
values
( 1, 72 ), ( 1, 88 ), (1, 81 ),
( 2, 83 ), ( 2, 88 ), (2, 79), (2,86),
( 3, 94 ),
( 4, 79 ), (4, 87)
Now, the simple premise is each contestant's best score, which is a simple MAX() of the score per contestant.
select
contestantid,
max( score ) highestScore
from
attempts
group by
contestantid
The result of the above query is the BASIS of the final ranking, so I have used that query as the FROM source. Instead of a table, the FROM is the result of the above query, which I have aliased "preAgg" for the pre-aggregation per contestant.
select
ContestantID,
c.personName,
DENSE_RANK() OVER ( order by HighestScore DESC ) as FinalRank
from
(select
contestantid,
max( score ) highestScore
from
attempts
group by
contestantid ) preAgg
JOIN Contestants c
on preAgg.contestantid = c.id
The join to the contestants table is easy enough and pulls in the name, but now look at the DENSE_RANK() clause. Since this sample data has no grouping for the scores (unlike, say, the Olympics, where there are specific sports and each sport has its own top ranks), we do not need the "PARTITION BY" clause. If your data does have such a grouping, for example an event_id per contest, you would add PARTITION BY on that column.
The ORDER BY clause is what you want. In this case, it uses the HighestScore column from the pre-aggregation query in DESCENDING order, so the HIGHEST score is at the top and the rest follow. The "as" gives the result its final column name.
DENSE_RANK() OVER ( order by HighestScore DESC ) as FinalRank
Results
ContestantID personName FinalRank
3 Jane 1
1 Bill 2
2 Mary 2
4 Mark 3
Now, if you only wanted a limit such as the top 3 ranks and you actually had 20+ competitors, just wrap this up one more time with a WHERE clause:
select * from
(
select
ContestantID,
c.personName,
DENSE_RANK() OVER ( order by HighestScore DESC ) as FinalRank
from
(select
contestantid,
max( score ) highestScore
from
attempts
group by
contestantid ) preAgg
JOIN Contestants c
on preAgg.contestantid = c.id ) dr
where
dr.FinalRank <= 3
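To check the walkthrough above end to end, here is a minimal sketch that loads the same sample data into an in-memory SQLite database from Python and runs the wrapped query. It assumes the bundled SQLite is 3.25 or newer (for window functions), swaps the SQL Server identity columns for SQLite's INTEGER PRIMARY KEY, and takes "top k ranks" to mean FinalRank <= k. The function name top_ranks is just for illustration.

```python
import sqlite3


def top_ranks(k=3):
    """Return (contestantid, personName, FinalRank) rows for the top k dense ranks."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE contestants (id INTEGER PRIMARY KEY, personName TEXT);
        INSERT INTO contestants (personName)
        VALUES ('Bill'), ('Mary'), ('Jane'), ('Mark');
        CREATE TABLE attempts (id INTEGER PRIMARY KEY,
                               contestantid INT NOT NULL,
                               score INT NOT NULL);
        INSERT INTO attempts (contestantid, score) VALUES
            (1, 72), (1, 88), (1, 81),
            (2, 83), (2, 88), (2, 79), (2, 86),
            (3, 94),
            (4, 79), (4, 87);
    """)
    rows = con.execute("""
        SELECT * FROM (
            SELECT preAgg.contestantid, c.personName,
                   DENSE_RANK() OVER (ORDER BY preAgg.highestScore DESC) AS FinalRank
            FROM (SELECT contestantid, MAX(score) AS highestScore
                  FROM attempts
                  GROUP BY contestantid) preAgg
            JOIN contestants c ON preAgg.contestantid = c.id
        ) dr
        WHERE dr.FinalRank <= ?
        ORDER BY dr.FinalRank, dr.personName
    """, (k,)).fetchall()
    con.close()
    return rows
```

Calling top_ranks(3) returns all four contestants here, because Bill and Mary tie at rank 2 and Mark still lands on rank 3.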

MY Reference table
MySQL solution
Using multiple common table expressions, DENSE_RANK, joins and GROUP_CONCAT:
WITH t1 AS
(SELECT *,DENSE_RANK() OVER(PARTITION BY event_id ORDER BY score DESC) AS 'rk' FROM Scoretable),
t2 AS
(SELECT * FROM t1 WHERE rk<=3),
t3 AS
(SELECT event_id , CASE WHEN rk=1 THEN p_Name ELSE NULL END AS 'first' FROM t2 WHERE rk=1 ),
t4 AS
(SELECT event_id , CASE WHEN rk=2 THEN p_Name ELSE NULL END AS 'second' FROM t2 WHERE rk=2 ),
t5 AS
(SELECT event_id , CASE WHEN rk=3 THEN p_Name ELSE NULL END AS 'third' FROM t2 WHERE rk=3 ),
t6 AS
(SELECT t3.event_id , t3.first , t4.second , t5.third FROM t3 LEFT JOIN t4 ON t3.event_id = t4.event_id LEFT JOIN t5 ON t3.event_id=t5.event_id)
-- LEFT JOINs keep events with fewer than three ranks; ORDER BY inside GROUP_CONCAT sorts tied names alphabetically
SELECT event_id , GROUP_CONCAT(DISTINCT first ORDER BY first) AS 'rank 1' , GROUP_CONCAT(DISTINCT second ORDER BY second) AS 'rank 2' , GROUP_CONCAT(DISTINCT third ORDER BY third) AS 'rank 3'
FROM t6 GROUP BY 1 ORDER BY 1;
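For comparison, here is a rough sketch of the per-event version of the task, run against SQLite from Python. The table attempts(event_id, name, score) and the function winners_chart are hypothetical, since the real HackerRank schema is not shown in this thread. It keeps only each participant's best attempt per event, applies DENSE_RANK() partitioned by event, and builds the comma-separated name lists in Python so that the alphabetical order does not depend on any database-specific GROUP_CONCAT ordering. It assumes SQLite 3.25 or newer.

```python
import sqlite3
from collections import defaultdict


def winners_chart(rows):
    """rows: iterable of (event_id, name, score) attempts (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE attempts (event_id INT, name TEXT, score INT)")
    con.executemany("INSERT INTO attempts VALUES (?, ?, ?)", rows)
    ranked = con.execute("""
        WITH best AS (              -- best attempt per participant and event
            SELECT event_id, name, MAX(score) AS score
            FROM attempts
            GROUP BY event_id, name
        ), ranked AS (              -- ties share a rank, no gaps after ties
            SELECT event_id, name,
                   DENSE_RANK() OVER (PARTITION BY event_id
                                      ORDER BY score DESC) AS rk
            FROM best
        )
        SELECT event_id, rk, name
        FROM ranked
        WHERE rk <= 3
        ORDER BY event_id, rk, name
    """).fetchall()
    con.close()

    # One list of names per rank (1, 2, 3) for every event.
    chart = defaultdict(lambda: [[], [], []])
    for event_id, rk, name in ranked:
        chart[event_id][rk - 1].append(name)
    return [(event_id, *(",".join(names) for names in ranks))
            for event_id, ranks in sorted(chart.items())]
```

An event with fewer than three distinct scores still appears in the output; its missing ranks come back as empty strings.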

Related

How do I get groups with the same number of students in SQL?

I have two tables:
CREATE TABLE students
(
STUDENT_ID smallint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
GROUP_ID smallint
);
CREATE TABLE groups
(
GROUP_ID smallint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
GROUP_NAME char(5)
);
Each group has a certain number of students.
To find out, I use this query:
SELECT
groups.group_name, COUNT (*)
FROM
students
JOIN
groups ON students.group_id = groups.group_id
GROUP BY
groups.group_name;
Output:
group_name  COUNT(*)
UH-76       27
LQ-99       16
UD-65       16
MQ-93       23
OC-92       23
PF-42       22
KZ-57       21
NR-64       28
WY-31       19
TX-59       17
Now the question:
How do I get groups with the same number of students?
group_name  COUNT(*)
LQ-99       16
UD-65       16
MQ-93       23
OC-92       23
How do I get the groups with the least number of students?
group_name  COUNT(*)
LQ-99       16
UD-65       16
To get the groups that share their student count with another group, you can number the rows within each count_ using ROW_NUMBER() partitioned by count_, then keep the counts that reach row number 2 (which means that count_ occurs at least twice).
WITH tab AS (
SELECT groups.group_name,
COUNT(*) AS count_
FROM students
INNER JOIN groups
ON students.group_id = groups.group_id
GROUP BY groups.group_name
), cte AS (
SELECT group_name,
count_,
ROW_NUMBER() OVER(
PARTITION BY count_
ORDER BY group_name
) AS rn
FROM tab
)
SELECT *
FROM tab
WHERE count_ IN (SELECT count_ FROM cte WHERE rn = 2);
If instead you want the groups with the fewest students, it suffices to keep the rows whose count_ equals the minimum, using a subquery:
WITH tab AS (
SELECT groups.group_name,
COUNT(*) AS count_
FROM students
INNER JOIN groups
ON students.group_id = groups.group_id
GROUP BY groups.group_name
)
SELECT *
FROM tab
WHERE count_ IN (SELECT MIN(count_) FROM tab);
Check the demo here.
Note that your initial query is employed within a common table expression inside these solutions.
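If you want to try both queries without a server, here is a small sketch that runs them on SQLite from Python. The helper names (build, shared_counts, smallest) and the tiny group sizes are made up for the example, and the groups table is renamed to study_groups because GROUPS is a keyword in recent SQLite versions. It assumes SQLite 3.25 or newer for ROW_NUMBER().

```python
import sqlite3

# The question's own counting query, reused as a CTE body below.
COUNTS = """
    SELECT g.group_name, COUNT(*) AS count_
    FROM students s
    JOIN study_groups g ON s.group_id = g.group_id
    GROUP BY g.group_name
"""


def build(sizes):
    """sizes: dict of group_name -> number of students to generate."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE study_groups (group_id INTEGER PRIMARY KEY, group_name TEXT)")
    con.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, group_id INT)")
    for name, size in sizes.items():
        gid = con.execute(
            "INSERT INTO study_groups (group_name) VALUES (?)", (name,)
        ).lastrowid
        con.executemany("INSERT INTO students (group_id) VALUES (?)", [(gid,)] * size)
    return con


def shared_counts(con):
    """Groups whose student count is shared with at least one other group."""
    return con.execute(f"""
        WITH tab AS ({COUNTS}),
        cte AS (
            SELECT count_,
                   ROW_NUMBER() OVER (PARTITION BY count_ ORDER BY group_name) AS rn
            FROM tab
        )
        SELECT group_name, count_
        FROM tab
        WHERE count_ IN (SELECT count_ FROM cte WHERE rn = 2)
        ORDER BY count_, group_name
    """).fetchall()


def smallest(con):
    """Groups whose student count equals the minimum count."""
    return con.execute(f"""
        WITH tab AS ({COUNTS})
        SELECT group_name, count_
        FROM tab
        WHERE count_ = (SELECT MIN(count_) FROM tab)
        ORDER BY group_name
    """).fetchall()
```

For example, build({"LQ-99": 2, "UD-65": 2, "MQ-93": 3, "OC-92": 3, "UH-76": 4}) gives a database where shared_counts returns the first four groups and smallest returns the two groups of size 2.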

Could this query be optimized?

My goal is to select records by two criteria that depend on each other, grouped by another criterion.
I found a solution that selects records by a single criterion and groups them:
SELECT *
FROM "records"
NATURAL JOIN (
SELECT "group", min("priority1") AS "priority1"
FROM "records"
GROUP BY "group") AS "grouped"
I think I understand the concept behind this search (select the properties you care about and match them in the original table), but when I use this concept with two priorities I get this monster:
SELECT *
FROM "records"
NATURAL JOIN (
SELECT *
FROM (
SELECT "group", "priority1", min("priority2") AS "priority2"
FROM "records"
GROUP BY "group", "priority1") AS "grouped2"
NATURAL JOIN (
SELECT "group", min("priority1") AS "priority1"
FROM "records"
NATURAL JOIN (
SELECT "group", "priority1", min("priority2") AS "priority2"
FROM "records"
GROUP BY "group", "priority1") AS "grouped2'"
GROUP BY "group") AS "GroupNested") AS "grouped1"
All I am asking is: couldn't it be written better (optimized and better-looking)?
JSFIDDLE
---- Update ----
The goal is to select a single id for each group by priority1 and priority2 (priority1 should be applied first and then priority2).
Example:
When I have table records with id, group, priority1 and priority2
with data:
id , group , priority1 , priority2
56 , 1 , 1 , 2
34 , 1 , 1 , 3
78 , 1 , 3 , 1
the result should be 56,1,1,2. For each group, search first for the min of priority1, then for the min of priority2.
I tried combining max and min together (in one query), but it did not find anything (I do not have this query anymore).
EXISTS() to the rescue! (I did some renaming to avoid reserved words)
SELECT *
FROM zrecords r
WHERE NOT EXISTS (
SELECT *
FROM zrecords nx
WHERE nx.zgroup = r.zgroup
AND ( nx.priority1 < r.priority1
OR nx.priority1 = r.priority1 AND nx.priority2 < r.priority2
)
);
Or, to avoid the AND / OR logic, compare the two-tuples directly:
SELECT *
FROM zrecords r
WHERE NOT EXISTS (
SELECT *
FROM zrecords nx
WHERE nx.zgroup = r.zgroup
AND (nx.priority1, nx.priority2) < (r.priority1 , r.priority2)
);
Maybe this is what you expect:
with dat as (
SELECT "group" grp
, priority1, priority2, id
, row_number() over (partition by "group" order by priority1) +
row_number() over (partition by "group" order by priority2) as lp
FROM "records")
select dt.grp, priority1, priority2, dt.id
from dat dt
join (select min(lp) lpmin, grp from dat group by grp) dt1 on (dt1.lpmin = dt.lp and dt1.grp =dt.grp)
Simply use row_number() . . . once:
select r.*
from (select r.*,
row_number() over (partition by "group" order by priority1, priority2) as seqnum
from records r
) r
where seqnum = 1;
Note: I would advise you to avoid natural join. You can use using instead (if you don't want to explicitly include equality comparisons).
Queries with natural join are very hard to debug, because the join keys are not listed. Worse, "natural" joins do not use properly declared foreign key relationships. They depend simply on columns that have the same name.
In tables that I design, they would never be useful anyway, because almost all tables have createdAt and createdBy columns.
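As a quick sanity check of the NOT EXISTS and ROW_NUMBER() answers above, here is a sketch that runs both on SQLite from Python. It reuses the renamed zrecords/zgroup identifiers and the three sample rows from the question; the two extra rows for group 2 and the function names are made up for the example. It assumes an SQLite build with row-value comparisons (3.15+) and window functions (3.25+).

```python
import sqlite3

# (id, zgroup, priority1, priority2); group 1 is the question's sample data.
ROWS = [(56, 1, 1, 2), (34, 1, 1, 3), (78, 1, 3, 1),
        (90, 2, 2, 5), (91, 2, 2, 4)]


def connect():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE zrecords (id INT, zgroup INT, priority1 INT, priority2 INT)")
    con.executemany("INSERT INTO zrecords VALUES (?, ?, ?, ?)", ROWS)
    return con


def via_not_exists(con):
    """Keep rows for which no row in the same group has a smaller (p1, p2) pair."""
    return con.execute("""
        SELECT r.id, r.zgroup, r.priority1, r.priority2
        FROM zrecords r
        WHERE NOT EXISTS (
            SELECT 1
            FROM zrecords nx
            WHERE nx.zgroup = r.zgroup
              AND (nx.priority1, nx.priority2) < (r.priority1, r.priority2)
        )
        ORDER BY r.zgroup
    """).fetchall()


def via_row_number(con):
    """Keep the first row per group when ordered by priority1, then priority2."""
    return con.execute("""
        SELECT id, zgroup, priority1, priority2
        FROM (
            SELECT r.*,
                   ROW_NUMBER() OVER (PARTITION BY zgroup
                                      ORDER BY priority1, priority2) AS seqnum
            FROM zrecords r
        )
        WHERE seqnum = 1
        ORDER BY zgroup
    """).fetchall()
```

Both functions agree on this data. They differ only when two rows in a group tie on both priorities: NOT EXISTS then returns every tied row, while ROW_NUMBER() returns exactly one of them.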

Get the player who attended highest number of International or National Game matches?

SELECT *
From (
SELECT Count(Player_Score_Record.Player_ID) AS TotalC,
Player.First_Name
FROM Player
INNER JOIN (Match INNER JOIN Player_Score_Record
ON Match.Match_ID = Player_Score_Record.Match_ID)
ON Player.Player_ID = Player_Score_Record.Player_ID
WHERE (Match.M_Type='International' Or Match.M_Type='National')
GROUP BY Player.First_Name
ORDER BY Count(Player_Score_Record.Player_ID) DESC
) A
Where A.TotalC = 3 ;
I had tried Max(Count(Player_Score_Record.Player_ID) As TotalC, but it also shows an error.
The latest try was to use WHERE A.TotalC = (SELECT Max(TotalC) From A);. It shows
ORA-00942: table or view does not exist.
So I can only directly assign the number 3 (the maximum number of matches joined).
I am not entirely sure whether you want:
1. a list of players ordered by number of played matches, or
2. a list of the players that have played as many matches as the player who has played the most.
Anyway, you can try either of the following solutions. Please take into account that these SQL statements are untested, as I currently have no access to a database.
If it is option 1 you could use following SQL:
select Player.first_name, player_matches.player_id, player_matches.total_played
from (
select Player_Score_Record.Player_ID, count(Player_Score_Record.Match_ID) total_played
from Player_Score_Record
join Match
on Match.Match_ID = Player_Score_Record.Match_ID
where (Match.M_Type='International' Or Match.M_Type='National')
group by Player_Score_Record.Player_ID
) player_matches
join Player on player_matches.player_id = player.player_id
order by player_matches.total_played desc
What we do in this select is:
First we get a list of Player_ID and the number of matches that player has played.
Then we join it with Player to get any further data we want.
Finally we order it. You might want to filter by total_played here.
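If, on the other hand, it is option 2 that you want, try the following SQL:
select Player.first_name, Player.Player_ID, count(Player_Score_Record.Match_ID) total_played
from Player_Score_Record
join Player
on Player_Score_Record.player_id = Player.player_id
join Match
on Match.Match_ID = Player_Score_Record.Match_ID
where (Match.M_Type='International' Or Match.M_Type='National')
group by Player.first_name, Player.Player_ID
having count(Player_Score_Record.Match_ID) = (
select max(total_played)
from (
select Player_Score_Record.Player_ID, count(Player_Score_Record.Match_ID) total_played
from Player_Score_Record
join Match
on Match.Match_ID = Player_Score_Record.Match_ID
where (Match.M_Type='International' Or Match.M_Type='National')
group by Player_Score_Record.Player_ID
)
)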
What we do in this select is:
First we get a list of Player_ID and the number of matches that player has played.
Then we extract the maximum number of played matches by any player.
Finally we get the player information of any player that has played that maximum number of matches.
Two options here:
Query 1: if you want all the Player_IDs of players with the highest number of games played; or
Query 2: if you only want a single player with the highest number of games played.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE Player
(
Player_ID PRIMARY KEY,
First_Name
) AS
SELECT 1, 'A' FROM DUAL
UNION ALL SELECT 2, 'A' FROM DUAL
UNION ALL SELECT 3, 'A' FROM DUAL;
CREATE TABLE Match
(
Match_ID PRIMARY KEY,
M_Type
) AS
SELECT 1, 'National' FROM DUAL
UNION ALL SELECT 2, 'International' FROM DUAL
UNION ALL SELECT 3, 'League' FROM DUAL;
CREATE TABLE Player_Score_Record
(
Player_ID NUMBER(1),
Match_ID NUMBER(1),
PRIMARY KEY (Player_ID, Match_ID),
FOREIGN KEY (Player_ID) REFERENCES Player (Player_ID),
FOREIGN KEY (Match_ID) REFERENCES Match (Match_ID)
);
INSERT INTO Player_Score_Record
SELECT 1,1 FROM DUAL
UNION ALL SELECT 1,2 FROM DUAL
UNION ALL SELECT 2,1 FROM DUAL
UNION ALL SELECT 2,2 FROM DUAL
UNION ALL SELECT 2,3 FROM DUAL
UNION ALL SELECT 3,2 FROM DUAL
UNION ALL SELECT 3,3 FROM DUAL;
Query 1:
WITH num_games AS (
SELECT r.Player_ID, COUNT(1) AS number_of_games_played
FROM Player_Score_Record r
INNER JOIN
Match m
ON (r.Match_ID = m.Match_ID)
WHERE m.M_Type IN ( 'National', 'International' )
GROUP BY r.Player_ID
)
SELECT Player_ID
FROM num_games
WHERE number_of_games_played = (SELECT MAX(number_of_games_played)
FROM num_games)
Results:
| PLAYER_ID |
|-----------|
| 1 |
| 2 |
Query 2:
WITH num_games AS (
SELECT r.Player_ID, COUNT(1) AS number_of_games_played
FROM Player_Score_Record r
INNER JOIN
Match m
ON (r.Match_ID = m.Match_ID)
WHERE m.M_Type IN ( 'National', 'International' )
GROUP BY r.Player_ID
ORDER BY number_of_games_played DESC, r.Player_ID
)
SELECT Player_ID
FROM num_games
WHERE ROWNUM = 1
Results:
| PLAYER_ID |
|-----------|
| 1 |
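The two queries above can also be tried without Oracle. Here is a sketch on SQLite from Python, using the same sample rows. Two things differ from the fiddle and are my own substitutions: the Match table is renamed to matches (MATCH is an SQLite keyword), and Oracle's ROWNUM = 1 is replaced by ORDER BY ... LIMIT 1. The function names are hypothetical.

```python
import sqlite3

# Games per player, counting only National and International matches.
NUM_GAMES = """
    WITH num_games AS (
        SELECT r.player_id, COUNT(*) AS games
        FROM player_score_record r
        JOIN matches m ON r.match_id = m.match_id
        WHERE m.m_type IN ('National', 'International')
        GROUP BY r.player_id
    )
"""


def connect():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE matches (match_id INTEGER PRIMARY KEY, m_type TEXT);
        INSERT INTO matches VALUES (1, 'National'), (2, 'International'), (3, 'League');
        CREATE TABLE player_score_record (player_id INT, match_id INT,
                                          PRIMARY KEY (player_id, match_id));
        INSERT INTO player_score_record VALUES
            (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 2), (3, 3);
    """)
    return con


def all_top_players(con):
    """Every player tied for the highest number of qualifying games."""
    sql = NUM_GAMES + """
        SELECT player_id
        FROM num_games
        WHERE games = (SELECT MAX(games) FROM num_games)
        ORDER BY player_id
    """
    return [player_id for (player_id,) in con.execute(sql)]


def single_top_player(con):
    """One player only; ties are broken by the lowest player_id."""
    sql = NUM_GAMES + """
        SELECT player_id
        FROM num_games
        ORDER BY games DESC, player_id
        LIMIT 1
    """
    return con.execute(sql).fetchone()[0]
```

On the sample data players 1 and 2 each have two qualifying games (player 2's League match is filtered out), so the first function returns both and the second returns player 1.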

SELECT a single field by ordered value

Consider the following two tables:
student_id score date
-------------------------
1 10 05-01-2013
2 100 05-15-2013
2 60 05-01-2012
2 95 05-14-2013
3 15 05-01-2011
3 40 05-01-2012
class_id student_id
----------------------------
1 1
1 2
2 3
I want to get unique class_ids where the score is above a certain threshold for at least one student, ordered by the latest score.
So for instance, if I wanted to get a list of classes where the score was > 80, I would get class_id 1 as a result, since student 2's latest score was above 80.
How would I go about this in t-sql?
Are you asking for this?
SELECT DISTINCT
t2.[class_ID]
FROM
t1
JOIN t2
ON t2.[student_id] = t1.[student_id]
WHERE
t1.[score] > 80
Edit based on your date requirement, then you could use row_number() to get the result:
select c.class_id
from class_student c
inner join
(
select student_id,
score,
date,
row_number() over(partition by student_id order by date desc) rn
from student_score
) s
on c.student_id = s.student_id
where s.rn = 1
and s.score >80;
See SQL Fiddle with Demo
Or you can use a WHERE EXISTS:
select c.class_id
from class_student c
where exists (select 1
from student_score s
where c.student_id = s.student_id
and s.score > 80
and s.[date] = (select max(date)
from student_score s1
where s.student_id = s1.student_id));
See SQL Fiddle with Demo
select distinct(class_id) from table2 where student_id in
(select distinct(student_id) from table1 where score > thresholdScore)
This should do the trick:
SELECT DISTINCT
CS.Class_ID
FROM
dbo.ClassStudent CS
CROSS APPLY (
SELECT TOP 1 *
FROM dbo.StudentScore S
WHERE CS.Student_ID = S.Student_ID
ORDER BY S.Date DESC
) L
WHERE
L.Score > 80
;
And here's another way:
WITH LastScore AS (
SELECT TOP 1 WITH TIES *
FROM dbo.StudentScore
ORDER BY Row_Number() OVER (PARTITION BY Student_ID ORDER BY Date DESC)
)
SELECT DISTINCT
CS.Class_ID
FROM
dbo.ClassStudent CS
WHERE
EXISTS (
SELECT *
FROM LastScore L
WHERE
CS.Student_ID = L.Student_ID
AND L.Score > 80
)
;
Depending on the data and the indexes, these two queries could have very different performance characteristics. It is worth trying several to see if one stands out as superior to the others.
It seems like there could be some version of the query where the engine would stop looking as soon as it finds just one student with the requisite score, but I am not sure at this moment how to accomplish that.
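For anyone who wants to experiment with the "latest score per student" logic outside SQL Server, here is a sketch of the ROW_NUMBER() variant on SQLite from Python, loaded with the question's sample rows. The table names student_score and class_student follow the answer above; the function classes_above is hypothetical. One assumption to note: the dates are stored as ISO yyyy-mm-dd text rather than the mm-dd-yyyy shown in the question, so that plain text ordering matches chronological order in SQLite.

```python
import sqlite3


def classes_above(threshold):
    """Class ids with at least one student whose latest score exceeds threshold."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE student_score (student_id INT, score INT, date TEXT);
        INSERT INTO student_score VALUES
            (1, 10, '2013-05-01'), (2, 100, '2013-05-15'), (2, 60, '2012-05-01'),
            (2, 95, '2013-05-14'), (3, 15, '2011-05-01'), (3, 40, '2012-05-01');
        CREATE TABLE class_student (class_id INT, student_id INT);
        INSERT INTO class_student VALUES (1, 1), (1, 2), (2, 3);
    """)
    rows = con.execute("""
        SELECT DISTINCT c.class_id
        FROM class_student c
        JOIN (
            -- rn = 1 marks each student's most recent score
            SELECT student_id, score,
                   ROW_NUMBER() OVER (PARTITION BY student_id
                                      ORDER BY date DESC) AS rn
            FROM student_score
        ) s ON c.student_id = s.student_id
        WHERE s.rn = 1 AND s.score > ?
        ORDER BY c.class_id
    """, (threshold,)).fetchall()
    con.close()
    return [class_id for (class_id,) in rows]
```

With a threshold of 80 only class 1 qualifies, through student 2's latest score of 100.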

Select column value where other column is max of group

I am trying to select two columns into a table (ID and state). The table should show the state with the maximum value for each ID. I've tried a few other examples but nothing seems to work.
Original data structure:
ID state value (FLOAT)
1 TX 921,294,481
1 SC 21,417,296
1 FL 1,378,132,290
1 AL 132,556,895
1 NC 288,176
1 GA 1,270,986,631
2 FL 551,374,452
2 LA 236,645,530
2 MS 2,524,536,050
2 AL 4,128,682,333
2 FL 1,503,991,028
The resulting data structure should therefore look like this:
ID STATE (Max Value)
1 FL
2 AL
Florida and Alabama having the largest values in their ID groups.
Any help would be greatly appreciated on this. I did find a SO answer here already, but could not make the answers work for me.
For SQL Server (and other products with windowed functions):
SELECT *
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY value desc) as rn
FROM
UnnamedTable
) t
WHERE
t.rn = 1
You can use a subquery to get this result:
select t1.id, t1.[state] MaxValue
from yourtable t1
inner join
(
select id, max(value) MaxVal
from yourtable
group by id
) t2
on t1.id = t2.id
and t1.value = t2.maxval
order by t1.id
See SQL Fiddle with Demo
A solution, based on the assumption that value is numeric:
SELECT
[ID],
[State],
[Value]
FROM
(
SELECT
[ID],
[State],
[Value],
Rank() OVER (PARTITION BY [ID] ORDER BY [Value] DESC) AS [Rank]
FROM [t1]
) AS [sub]
WHERE [sub].[Rank] = 1
ORDER BY
[ID] ASC,
[State] ASC
If multiple States with the same ID have the same Value, they would all get the same Rank. This is different from using Row_Number, which returns unique row numbers, with the order among tied rows chosen arbitrarily. (See also: SQL RANK() versus ROW_NUMBER())
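To see the RANK() versus ROW_NUMBER() difference on ties concretely, here is a sketch on SQLite from Python. It uses the sample values from the question, with the table name t1 taken from the last answer; the function top_states and the explicit tie-break by state for ROW_NUMBER() are assumptions of mine, added so the single pick is deterministic.

```python
import sqlite3

# (ID, state, value) rows from the question.
SAMPLE = [
    (1, "TX", 921294481), (1, "SC", 21417296), (1, "FL", 1378132290),
    (1, "AL", 132556895), (1, "NC", 288176), (1, "GA", 1270986631),
    (2, "FL", 551374452), (2, "LA", 236645530), (2, "MS", 2524536050),
    (2, "AL", 4128682333), (2, "FL", 1503991028),
]


def top_states(rows, func="RANK"):
    """Return (id, state) for the top value per id, using RANK or ROW_NUMBER."""
    if func not in ("RANK", "ROW_NUMBER"):
        raise ValueError("func must be 'RANK' or 'ROW_NUMBER'")
    # ROW_NUMBER gets an explicit tie-break so its single pick is deterministic.
    order = "value DESC" if func == "RANK" else "value DESC, state"
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t1 (id INT, state TEXT, value REAL)")
    con.executemany("INSERT INTO t1 VALUES (?, ?, ?)", rows)
    result = con.execute(f"""
        SELECT id, state
        FROM (
            SELECT id, state,
                   {func}() OVER (PARTITION BY id ORDER BY {order}) AS rnk
            FROM t1
        )
        WHERE rnk = 1
        ORDER BY id, state
    """).fetchall()
    con.close()
    return result
```

On the question's data both functions give FL for ID 1 and AL for ID 2. With tied top values, such as two states sharing the maximum for one ID, RANK returns both states while ROW_NUMBER returns only one.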