Recursive CTE Query to make groups without duplicates - sql

I need to build a sql query to find the least expensive squad (salary-wise) for teams in a European volleyball league.
For each team, the squad consists of 6 players. The entire team has 10 players (but could have more).
Positions are:
Libero
Opposite
Setter
Middle
Outside Hitter
Defensive Specialist
Here is a query result showing Players, Positions and Salaries. Note that some players can play multiple positions.
Salaries are in Euros, and are divided by 1000 to make things simpler.
Player Position Salary
--------------------------
Player A, Libero, 100
Player A, Defensive Specialist, 100
Player B, Opposite, 200
Player C, Middle, 150
Player D, Outside Hitter, 175
Player D, Opposite, 150
Player E, Setter, 100
Player F, Setter, 150
Player G, Middle, 125
Player G, Opposite, 100
Player H, Libero, 75
Player I, Outside Hitter, 150
Player J, Defensive Specialist, 200
Since some players can fill multiple positions, I need to weigh the benefit of moving them to another position to reduce the payroll, even if they are less expensive in the first position.
For example, suppose a player can fill both the Libero and Defensive Specialist positions and, as a Libero, is less expensive than the other Liberos on the team. Using that player as a Defensive Specialist instead may still make the whole squad cheaper than leaving them at Libero.
My first instinct is to generate all possible 6-person squads with all of the players (making sure that the same player is not in the same squad twice) - then ordering by the sum of salaries.
Doing this would find all combinations, and give me the least-expensive squad. But depending on the size of the team, this could be a very intensive task, so I wonder if there's a more efficient way.
I can find the first squad like this (though this doesn't check for duplicate player names in different positions):
WITH Squads AS
(
SELECT PlayerName, Salary, Position,
ROW_NUMBER() OVER (PARTITION BY Position ORDER BY Position) AS RowNumber
FROM Players
)
SELECT * FROM Squads WHERE RowNumber = 1 ORDER BY Position, Salary DESC, PlayerName
I believe I would then want to add a UNION ALL or a CROSS APPLY in the CTE to recurse through all players and positions, making sure that each squad has a player in each of the 6 positions.
What's the best method to do this?

This answer stores the position data in a separate table, and then recursively iterates over each position ID, joining previously unselected players who are listed under that position to the result. The running salary is stored, along with the players currently chosen for the given squad:
with recursive cte(pid, position, player, players, salary) as (
select 2, p.position, p1.player pl1, p1.player, p1.salary
from positions p join players p1 on p1.position = p.position where p.id = 1
union all
select c.pid + 1, p.position, p1.player, c.players||','||p1.player, c.salary + p1.salary
from cte c join positions p on p.id = c.pid join players p1 on p1.position = p.position where not c.players ~ p1.player
),
final_team as (
select c.players, c.salary from cte c where c.pid = 7
)
select f.* from final_team f where f.salary = (select min(f1.salary) from final_team f1)

The accepted answer works well. I had forgotten to mention what SQL flavor I was using - SQL Server 2019. The answer used PostgreSQL. Here is the SQL Server equivalent (also with Text fields changed to NVARCHAR):
with cte (pid, position, player, players, salary) as (
select 2, p.position, p1.player pl1, p1.player, p1.salary
from positions p join players p1 on p1.position = p.position where p.id = 1
union all
select c.pid + 1, p.position, p1.player, CONVERT(nvarchar(200),c.players+','+p1.player), c.salary + p1.salary
from cte c join positions p on p.id = c.pid join players p1 on p1.position = p.position where c.players not like '%'+p1.player+'%'
),
final_team as (
select c.players, c.salary from cte c where c.pid = 7
)
select f.* from final_team f where f.salary = (select min(f1.salary) from final_team f1)
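To check the recursive approach against the sample data from the question, here is a minimal sketch that runs the same idea in SQLite through Python's sqlite3 module. The table layout (positions with id and position, players with player, position and salary) follows the answers above; the numbering of the position ids is an assumption, and the duplicate check uses SQLite's instr() on a comma-delimited list instead of the regex or LIKE test, so that one name cannot accidentally match as a substring of another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE positions (id INTEGER PRIMARY KEY, position TEXT);
CREATE TABLE players (player TEXT, position TEXT, salary INTEGER);
-- Position ids are assumed; any 1..6 numbering works.
INSERT INTO positions VALUES
  (1, 'Libero'), (2, 'Opposite'), (3, 'Setter'),
  (4, 'Middle'), (5, 'Outside Hitter'), (6, 'Defensive Specialist');
-- Sample rows copied from the question.
INSERT INTO players VALUES
  ('Player A', 'Libero', 100), ('Player A', 'Defensive Specialist', 100),
  ('Player B', 'Opposite', 200), ('Player C', 'Middle', 150),
  ('Player D', 'Outside Hitter', 175), ('Player D', 'Opposite', 150),
  ('Player E', 'Setter', 100), ('Player F', 'Setter', 150),
  ('Player G', 'Middle', 125), ('Player G', 'Opposite', 100),
  ('Player H', 'Libero', 75), ('Player I', 'Outside Hitter', 150),
  ('Player J', 'Defensive Specialist', 200);
""")

query = """
WITH RECURSIVE cte(pid, players, salary) AS (
  -- Anchor: every candidate for position 1; pid is the next position to fill.
  SELECT 2, p1.player, p1.salary
  FROM positions p JOIN players p1 ON p1.position = p.position
  WHERE p.id = 1
  UNION ALL
  -- Step: add a candidate for position pid who is not already in the squad.
  SELECT c.pid + 1, c.players || ',' || p1.player, c.salary + p1.salary
  FROM cte c
  JOIN positions p ON p.id = c.pid
  JOIN players p1 ON p1.position = p.position
  WHERE instr(',' || c.players || ',', ',' || p1.player || ',') = 0
),
final_team AS (
  SELECT players, salary FROM cte WHERE pid = 7
)
SELECT players, salary FROM final_team
WHERE salary = (SELECT MIN(salary) FROM final_team)
"""

cheapest = conn.execute(query).fetchall()
print(cheapest)
```

On this data the query returns a single squad with a total of 675, listed in position-id order: Player H, Player G, Player E, Player C, Player I, Player A. Note that Player G is placed at Opposite rather than Middle, which is the kind of trade-off the question describes.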

Related

The difference between the minimum and maximum number of games

Question: Show the names of all players for whom the difference between the minimum and maximum number of games is greater than 5.
select p.name
from player p
join competition c
on c.playerID = p.playerID
where (
(select count(*) from competition
where count(games) > 1
group by playerID
) - (
select count(*) from competition
where count(games) <= 1
group by playerID
))> 5;
I'm kind of lost. I'm not sure whether this is the right approach or how I should proceed: should I use count to find the minimum and maximum number of games and compare the difference with 5, or should I use the min and max functions instead of count? I would be very grateful if someone could explain the logic to me.
Tables:
player      competition
--------    -----------
playerID    playerID
name        games
birthday    date
address
telefon
SELECT
P.Name,
MIN(C.Games) MinGame,
MAX(C.Games) MaxGame
FROM Player P
INNER JOIN Competition C
ON C.PlayerId = P.PlayerId
GROUP BY P.PlayerId, P.Name
HAVING MAX(C.Games) - MIN(C.Games) > 5
It should be a simple query:
With tab1 AS (Select player.name, min(games) min_game, max(games) max_game,
max(games) - min(games) diff
from player JOIN competition ON player.playerID = competition.playerID
group by player.playerID, player.name)
Select tab1.name from tab1
WHERE diff > 5;
I am adding playerID to the GROUP BY because two different players could have the same name.
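The MIN/MAX approach can be tried out with a minimal sketch like the following, which uses Python's sqlite3 module. The table and column names come from the question; the three players and their game counts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (playerID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE competition (playerID INTEGER, games INTEGER, date TEXT);
-- Hypothetical sample data.
INSERT INTO player VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO competition VALUES
  (1, 2, '2020-01-01'), (1, 9, '2020-02-01'),  -- Ann: max - min = 7
  (2, 4, '2020-01-01'), (2, 6, '2020-02-01'),  -- Ben: max - min = 2
  (3, 5, '2020-01-01');                         -- Cy: one row, difference 0
""")

# One group per player; HAVING filters on the aggregate difference.
rows = conn.execute("""
SELECT p.name, MAX(c.games) - MIN(c.games) AS diff
FROM player p
JOIN competition c ON c.playerID = p.playerID
GROUP BY p.playerID, p.name
HAVING MAX(c.games) - MIN(c.games) > 5
""").fetchall()
print(rows)
```

Only Ann is returned here, because hers is the only spread above 5. COUNT is not needed at all: it counts rows, whereas the question asks about the values stored in games.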

select data that belongs to category

I have three tables like this:
table player:
id, name
table matchevent:
id, player_id, eventcategory_id, description
table eventcategory:
id, name
Relations are:
player : matchevent - 1:N
matchevent : eventcategory - N:1
In table player I have football player names, and in eventcategory I have events such as yellow card, substitution etc. In table matchevent I store player events, each of which belongs to some category.
Now I want to retrieve the COUNT of events for all players. For example:
first player has COUNT of yellow cards 0, substitution 0 and goals 0
second player has 1 yellow card, 0 substitution and 2 goals
third player has 0 yellow cards, 0 substitution, 0 goals
etc
How can I do that in DQL? I tried LEFT JOIN and IN, but it doesn't work. It selects only players with events, not all players.
->createQuery('SELECT p, i
FROM MyBundle:player p
LEFT JOIN p.matchevent i
WHERE i.eventcategory IN (:eventcategory)
ORDER BY p.player ASC
')
->setParameters(array(
'eventcategory' => $eventcategory,
))
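A right join will not help here: DQL only supports INNER and LEFT joins, and your LEFT JOIN is already the right choice. The problem is the WHERE clause, which throws away every player whose joined event is NULL. Move the category filter into the join condition with WITH, so that players without matching events are kept:
->createQuery('SELECT p, i
FROM MyBundle:player p
LEFT JOIN p.matchevent i WITH i.eventcategory IN (:eventcategory)
ORDER BY p.player ASC
')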
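The effect of filtering a LEFT JOIN in the WHERE clause versus in the join condition can be seen in plain SQL. The sketch below uses Python's sqlite3 module with the three tables from the question; the rows are invented to match the example counts given there, and the category ids (1 = yellow card, 2 = substitution, 3 = goal) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE eventcategory (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE matchevent (
  id INTEGER PRIMARY KEY, player_id INTEGER,
  eventcategory_id INTEGER, description TEXT);
-- Hypothetical sample data: only the second player has events.
INSERT INTO player VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
INSERT INTO eventcategory VALUES
  (1, 'yellow card'), (2, 'substitution'), (3, 'goal');
INSERT INTO matchevent VALUES
  (1, 2, 1, 'foul'), (2, 2, 3, 'header'), (3, 2, 3, 'penalty');
""")

# Filter in WHERE: rows with a NULL event are removed after the join,
# so players without events disappear.
where_filtered = conn.execute("""
SELECT p.name, COUNT(e.id)
FROM player p
LEFT JOIN matchevent e ON e.player_id = p.id
WHERE e.eventcategory_id IN (1, 2, 3)
GROUP BY p.id, p.name
ORDER BY p.id
""").fetchall()

# Filter in the join condition: every player survives, and conditional
# sums give one count per category.
on_filtered = conn.execute("""
SELECT p.name,
       SUM(CASE WHEN e.eventcategory_id = 1 THEN 1 ELSE 0 END) AS yellow_cards,
       SUM(CASE WHEN e.eventcategory_id = 2 THEN 1 ELSE 0 END) AS substitutions,
       SUM(CASE WHEN e.eventcategory_id = 3 THEN 1 ELSE 0 END) AS goals
FROM player p
LEFT JOIN matchevent e
  ON e.player_id = p.id AND e.eventcategory_id IN (1, 2, 3)
GROUP BY p.id, p.name
ORDER BY p.id
""").fetchall()

print(where_filtered)
print(on_filtered)
```

The first query returns only the second player, while the second returns all three players with zero counts where no events exist.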

SQL, Struggling to believe HAVING can be more efficient than a join

I am trying to find players that have played in more than one game for a team in the following tables (** denotes primary key), and I find it hard to believe that the best query I can come up with (below) is the most efficient. Ideas on how to improve it, and explanations as to why, would be much appreciated (I'm trying to learn SQL).
Team (*tid*, name)
Game (*gid*, tid)
Player (*gid*, *name*)
SELECT Team_Name, Player_Name
FROM (SELECT GID, TID FROM GAME) G
,(SELECT NAME AS Player_Name, GID FROM PLAYER) P
,(SELECT NAME AS Team_Name, TID FROM TEAM) T
WHERE ( G.GID = P.GID
AND Player_Name IN (SELECT P.NAME
FROM GAME G
,PLAYER P
WHERE G.GID = P.GID
GROUP BY P.NAME
HAVING COUNT(P.NAME) > 1)
AND T.TID = G.TID
)
GROUP BY Team_Name, Player_Name
HAVING COUNT(Player_Name) > 1
ORDER BY Team_Name
You're asking which players have played in more than a single game.
SELECT P.Name AS Player_Name
FROM Player P
GROUP BY P.Name
HAVING COUNT(DISTINCT P.GID) > 1
That will return all players who have played in more than 1 game (GID).
If you'd like to also GROUP BY team, then do this:
SELECT P.Name AS Player_Name, T.Name AS Team_Name
FROM Player P
JOIN Game G ON P.GID = G.GID
JOIN Team T ON G.TID = T.TID
GROUP BY P.Name, T.Name
HAVING COUNT(DISTINCT G.GID) > 1
It seems odd to have the GID in the Player table. A PlayerGames table that stores the PlayerId and GameId would make more sense and would be better for database normalization. The Player table should only store a single record for each player.
Also, what is the real association between the player and the team? In this scenario, you're saying a player has to play a game, and a game has to have a team (or should a game have 2 or more teams?). Let us know what you're going for, and we can help present your best option.
Good luck.
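For anyone who wants to try the grouped version, here is a minimal sketch using Python's sqlite3 module. The schema is the one from the question (Team, Game, Player with gid and name); the teams, games and players are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Team (tid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Game (gid INTEGER PRIMARY KEY, tid INTEGER);
CREATE TABLE Player (gid INTEGER, name TEXT, PRIMARY KEY (gid, name));
-- Hypothetical sample data.
INSERT INTO Team VALUES (1, 'Reds'), (2, 'Blues');
INSERT INTO Game VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO Player VALUES
  (10, 'Ann'), (11, 'Ann'), (12, 'Ann'),  -- Ann: two Reds games, one Blues game
  (10, 'Ben'), (12, 'Ben');               -- Ben: one game for each team
""")

# One group per (team, player); keep groups with more than one distinct game.
rows = conn.execute("""
SELECT T.name AS Team_Name, P.name AS Player_Name
FROM Player P
JOIN Game G ON P.gid = G.gid
JOIN Team T ON G.tid = T.tid
GROUP BY T.tid, T.name, P.name
HAVING COUNT(DISTINCT G.gid) > 1
ORDER BY T.name, P.name
""").fetchall()
print(rows)
```

Only Ann for the Reds qualifies. Ben has played two games overall, but not two for the same team, so the per-team grouping excludes him.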

Trouble With GROUP BY Learning Derby SQL

I have formed this query to produce the pitcher from each team who has the most wins. My trouble comes in that I need to group them a certain way and I keep having scoping issues when trying to do so. W is the number of wins for each pitcher. Here is my pre-grouped statement...
SELECT
(SELECT p1.nameFirst FROM Players p1 Where (one.playerID = p1.playerID)),
(SELECT p1.nameLast FROM Players p1 Where (one.playerID = p1.playerID)),
one.W, (SELECT t1.name FROM Teams t1 Where(one.teamID = t1.teamID))
FROM Pitching one
Where (one.W >= ALL
(SELECT two.W
FROM Pitching two
Where (two.teamID = one.teamID)));
I need to group the tuples by league and within the leagues group by division. League (lgID) and division (divID) exist in the Teams table. Can someone point me in the right direction? Thank you.
This is top six rows of what is currently output...
Zach Britton 11 Baltimore Orioles
Mark Buehrle 13 Chicago White Sox
Madison Bumgarner 13 San Francisco Giants
Jhoulys Chacin 11 Colorado Rockies
Bruce Chen 12 Kansas City Royals
Kevin Correia 12 Pittsburgh Pirates
My desired output is to have these teams sorted by league (NL/AL) and within the leagues have them sorted by division.
Based on your updated comment. I think this is what you want.
SELECT
pl.nameFirst
, pl.nameLast
, p.W
, t.name
FROM
(
SELECT
MAX(p1.W) AS W
, p1.teamId
FROM
Pitching p1
GROUP BY
p1.teamId
) t1
JOIN
Pitching p
ON t1.W = p.W
AND t1.teamId = p.teamId
JOIN
Players pl
ON p.playerID = pl.playerID
JOIN
Teams t
ON p.teamID = t.teamID
ORDER BY
t.lgID
, t.divID
I do agree with swasheck: there are some opportunities for improvement in your schema. As swasheck said, teamId should be in Players, not in Pitching.
This may be an issue of structure. Using what I believe to be your structure we could probably narrow it down like this:
select Players.nameFirst, Players.nameLast, TopPitcher.Winner,
Teams.name, League.Name, Division.Name
from (select playerID, max(Wins) as Winner
from (select playerID, teamID, sum(W) as Wins
from Pitching
group by playerID, teamID ) PitchingWins
group by playerID) TopPitcher
join Players
on TopPitcher.playerID = Players.playerID
join Teams
on Teams.teamID = Players.teamID
join League
on Teams.leagueID = League.leagueID
join Division
on League.divisionID = Division.divisionID
order by League.Name, Division.Name
Now. Having said that, this is only for the structure you've given (with some other interpolation). Your overall structure is faulty as I would probably relate Player to Teams and not Pitching to Teams since you might get some sort of data errors regarding team wins vs. pitcher wins.
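The derived-table approach (maximum W per team, joined back to Pitching, then sorted by league and division) can be sketched as follows with Python's sqlite3 module. The table and column names are the ones used in the question; the three top pitchers and their win totals come from the sample output, while the second pitcher on each team, all playerID values, and the lgID and divID codes are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Players (playerID TEXT PRIMARY KEY, nameFirst TEXT, nameLast TEXT);
CREATE TABLE Teams (teamID TEXT PRIMARY KEY, name TEXT, lgID TEXT, divID TEXT);
CREATE TABLE Pitching (playerID TEXT, teamID TEXT, W INTEGER);
INSERT INTO Teams VALUES
  ('BAL', 'Baltimore Orioles', 'AL', 'E'),
  ('CHA', 'Chicago White Sox', 'AL', 'C'),
  ('SFN', 'San Francisco Giants', 'NL', 'W');
-- The 'Other' pitchers are hypothetical filler rows.
INSERT INTO Players VALUES
  ('p1', 'Zach', 'Britton'), ('p2', 'Mark', 'Buehrle'),
  ('p3', 'Madison', 'Bumgarner'), ('p4', 'Other', 'One'),
  ('p5', 'Other', 'Two'), ('p6', 'Other', 'Three');
INSERT INTO Pitching VALUES
  ('p1', 'BAL', 11), ('p4', 'BAL', 7),
  ('p2', 'CHA', 13), ('p5', 'CHA', 9),
  ('p3', 'SFN', 13), ('p6', 'SFN', 10);
""")

# t1 holds the highest win total per team; joining back on (teamID, W)
# recovers the pitcher(s) who reached it.
rows = conn.execute("""
SELECT pl.nameFirst, pl.nameLast, p.W, t.name
FROM (SELECT teamID, MAX(W) AS W FROM Pitching GROUP BY teamID) t1
JOIN Pitching p ON p.teamID = t1.teamID AND p.W = t1.W
JOIN Players pl ON pl.playerID = p.playerID
JOIN Teams t ON t.teamID = p.teamID
ORDER BY t.lgID, t.divID
""").fetchall()
print(rows)
```

No GROUP BY is needed in the outer query. Sorting by lgID and then divID is enough to put the leagues together with the divisions ordered inside each league. If two pitchers on a team tie for the most wins, both rows are returned.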

how to write this query in sql

How do I write this query in SQL?
"For every player that has played more than two games, list the player name, total amount of winnings and number of games played for each player." The result should be sorted by the winnings in descending order.
The player table has these attributes:
playerId, playerName, age
and the games table has these attributes:
gameId, playerId, results
Note that the results attribute holds one of first, second, third, ..., or no show; the winner is the player whose result is first.
This is my weak query. It does not give the right answer, but it is all I could come up with. Any ideas?
select playerName,count(*),count(*)
from games,player
where games.playerId=player.playerId
group by games.results
You want to look into GROUP BY and HAVING in conjunction with COUNT. Something like this would probably do (untested):
SELECT
p.playerName
,COUNT(*)
,SUM(g.Winnings) -- you didn't name this column
FROM
games g
INNER JOIN player p ON g.playerId = p.playerId
WHERE
g.results = 'first' -- whatever indicates this player was the winner; note this limits both aggregates to games won
GROUP BY
p.playerName
HAVING
COUNT(*) > 2
Try this (pretty much as you said it in English).
If "winnings" is the amount won in the game, then:
Select playerName, count(*) Games, -- Number of game records per player
Sum(g.Winnings) Winnings -- Sum of a Winnings attribute (dollars ??)
from player p Join Games g -- from the two tables
On g.PlayerId = p.PlayerId -- connected using PlayerId
Group by p.playerName -- Output in one row per Player
Having Count(*) > 2 -- only show players w/more than 2 games
Order By Sum(g.Winnings) Desc -- sort the rows by player winnings, highest first
If by "winnings" you mean the number of games won, then:
Select playerName, Count(*) Games, -- Number of game records per player
Sum(Case g.WonTheGame -- or whatever attribute is used
When 'Y' Then 1 -- to specify that player won
Else 0 End) Wins -- Output in one row per Player
From player p Join Games g -- from the two tables
On g.PlayerId = p.PlayerId -- connected using PlayerId
Group by p.playerName -- Output in one row per Player
Having Count(*) > 2 -- only show players w/more than 2 games
Order By Sum(Case g.WonTheGame -- Sort by Number of games Won
When 'Y' Then 1
Else 0 End) Desc
Try this :
SELECT p.playerName, COUNT(g.playerId) AS NumberOfPlays
FROM games g, player p
WHERE g.playerId = p.playerId
GROUP BY g.playerId, p.playerName
HAVING COUNT(g.playerId) > 2
ORDER BY NumberOfPlays DESC
SELECT - the data you want to display
FROM - the tables
WHERE - both IDs match each other
GROUP BY - the player's ID and name, so all the counts are correct
HAVING - make sure they played more than two games
ORDER BY - order the results the way you want them.
It's tough to glean exactly what you need from your question, but try something like this:
select playerName, count(*)
from games g
join player p ON g.playerId = p.playerId
group by playerName
having count(*) > 2
order by count(*) DESC
select
playerName,
sum(if(games.results = 'first',1,0)) as wins,
count(*) as gamesPlayed
from player
join games on games.playerId = player.playerId
group by player.playerId, playerName
having count(*) > 2
order by wins desc;
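As a runnable check of the GROUP BY / HAVING pattern these answers share, here is a minimal sketch using Python's sqlite3 module. It treats "winnings" as the number of games where results is 'first', which is one possible reading of the question. The schema follows the question, and the sample rows are made up. A portable CASE expression stands in for MySQL's if().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (playerId INTEGER PRIMARY KEY, playerName TEXT, age INTEGER);
CREATE TABLE games (gameId INTEGER, playerId INTEGER, results TEXT);
-- Hypothetical sample data.
INSERT INTO player VALUES (1, 'Ann', 30), (2, 'Ben', 25), (3, 'Cy', 41);
INSERT INTO games VALUES
  (1, 1, 'first'), (2, 1, 'first'), (3, 1, 'third'),   -- Ann: 3 games, 2 wins
  (1, 2, 'second'), (2, 2, 'no show'),
  (3, 2, 'first'), (4, 2, 'second'),                    -- Ben: 4 games, 1 win
  (4, 3, 'first'), (5, 3, 'first');                     -- Cy: only 2 games
""")

# Group per player, keep players with more than two games,
# and sort by the number of wins, highest first.
rows = conn.execute("""
SELECT p.playerName,
       SUM(CASE WHEN g.results = 'first' THEN 1 ELSE 0 END) AS wins,
       COUNT(*) AS gamesPlayed
FROM player p
JOIN games g ON g.playerId = p.playerId
GROUP BY p.playerId, p.playerName
HAVING COUNT(*) > 2
ORDER BY wins DESC
""").fetchall()
print(rows)
```

Cy has two wins but is filtered out by HAVING because he played only two games. Grouping by results instead of by player, as in the original attempt, would produce one row per result value rather than one per player.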