SUM() acting weird when I select from another table - sql

So when I run the following query:
SELECT master.playerID, master.nameFirst, master.nameLast, SUM(managers.G) AS games, SUM(managers.W) AS wins
FROM master, managers
WHERE managers.playerID = master.playerID
AND managers.playerID = 'lemonbo01'
GROUP BY managers.playerID;
I get the appropriate sum of games and sum of wins. But the moment I include another table, in this instance the pitching table, like this:
SELECT master.playerID, master.nameFirst, master.nameLast, SUM(managers.G) AS games, SUM(managers.W) AS wins
FROM master, managers, pitching
WHERE managers.playerID = master.playerID
AND managers.playerID = 'lemonbo01'
GROUP BY managers.playerID;
Although I'm not changing anything about the query except selecting from one more table, the wins and games change to absurd numbers. What exactly is causing this?
Thanks in advance.
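The second query lists pitching in FROM without any join condition, so every matching managers row is paired with every row of pitching (a Cartesian product), and the sums are multiplied by the number of pitching rows. Here is a minimal sketch of that effect. It uses SQLite through Python's sqlite3 module only so that it runs anywhere, and the rows are made up, not real baseball data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master   (playerID TEXT, nameFirst TEXT, nameLast TEXT);
    CREATE TABLE managers (playerID TEXT, G INTEGER, W INTEGER);
    CREATE TABLE pitching (playerID TEXT);

    -- Made-up rows, only to make the effect visible.
    INSERT INTO master   VALUES ('lemonbo01', 'First', 'Last');
    INSERT INTO managers VALUES ('lemonbo01', 100, 60), ('lemonbo01', 50, 20);
    INSERT INTO pitching VALUES ('p1'), ('p2'), ('p3');
""")

# Same query as in the question; only the FROM list changes.
QUERY = """
    SELECT master.playerID, SUM(managers.G) AS games, SUM(managers.W) AS wins
    FROM {tables}
    WHERE managers.playerID = master.playerID
      AND managers.playerID = 'lemonbo01'
    GROUP BY managers.playerID
"""

two_tables = conn.execute(QUERY.format(tables="master, managers")).fetchone()
three_tables = conn.execute(
    QUERY.format(tables="master, managers, pitching")
).fetchone()
pitching_rows = conn.execute("SELECT COUNT(*) FROM pitching").fetchone()[0]

print(two_tables)    # sums over the two managers rows
print(three_tables)  # the same sums multiplied by the number of pitching rows
```

If pitching is really needed, give it a join condition. Even then, several pitching rows per player would still repeat each managers row, so it is usually safer to aggregate each table in its own subquery and join the results.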

Related

Getting win % from a table that holds "Winners" and "Losers"

I am working with an SQLite database where I store matches/fights between players like so:
matchId[int] winner[text] loser[text]
I have made queries that sum up how many times a player has won a fight and another one for how many fights a player has lost. But is there a way, in SQL, to type this so that I can find the win% directly from the database or do I have to calculate that elsewhere? There is no problem calculating this elsewhere, but I got intrigued to figure out if/how it can be done purely in SQL.
What I am trying to achieve is basically:
SELECT winner, COUNT(winner) as Wins FROM Fights GROUP BY winner
divided by
SELECT loser, COUNT(loser) as Losses FROM Fights GROUP BY loser;
for each player, which in this table is either a "winner" or a "loser". I also have a table (Players) that holds all these players as "player" that could be utilized to make this work.
You can use union all and aggregation:
select player, avg(win) as win_ratio
from (
select winner as player, 1.0 as win from fights
union all
select loser, 0 from fights
) t
group by player
This gives you, for each player who participated in at least one fight, a decimal number between 0 and 1 that represents the win ratio.
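To see the union all approach on concrete data, here is a small sketch against an in-memory SQLite database (driven from Python's sqlite3 module). The four fights and the player names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fights (matchId INTEGER, winner TEXT, loser TEXT);
    -- Made-up fights between three hypothetical players.
    INSERT INTO Fights VALUES
        (1, 'ann', 'bob'),
        (2, 'ann', 'cat'),
        (3, 'bob', 'ann'),
        (4, 'cat', 'bob');
""")

# Each fight contributes a 1.0 for its winner and a 0 for its loser;
# averaging those values per player yields the win ratio.
rows = conn.execute("""
    SELECT player, AVG(win) AS win_ratio
    FROM (
        SELECT winner AS player, 1.0 AS win FROM Fights
        UNION ALL
        SELECT loser, 0 FROM Fights
    ) t
    GROUP BY player
    ORDER BY player
""").fetchall()

win_ratio = dict(rows)
for player, ratio in rows:
    print(f"{player}: {ratio:.2f}")
```

With this data ann has 2 wins in 3 fights, bob 1 in 3, and cat 1 in 2. Multiply by 100 if you want a percentage.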

Grouping a percentage calculation in postgres/redshift

I keep running into the same problem over and over again, hoping someone can help...
I have a large table with a category column that has 28 entries for donkey breed, then I'm counting two specific values grouped by each of those categories in subqueries like this:
WITH totaldonkeys AS (
SELECT donkeybreed,
COUNT(*) AS total
FROM donkeytable1
GROUP BY donkeybreed
)
,
sickdonkeys AS (
SELECT donkeybreed,
COUNT(*) AS totalsick
FROM donkeytable1
JOIN donkeyhealth on donkeytable1.donkeyid = donkeyhealth.donkeyid
WHERE donkeyhealth.sick IS TRUE
GROUP BY donkeybreed
)
,
It's my goal to end up with a table that has primarily the percentage of sick donkeys for each breed but I always end up struggling like hell with the problem of not being able to group by without using an aggregate function which I cannot do here:
SELECT (CAST(sickdonkeys.totalsick AS float) / totaldonkeys.total) * 100 AS percentsick,
totaldonkeys.donkeybreed
FROM totaldonkeys, sickdonkeys
GROUP BY totaldonkeys.donkeybreed
When I run this I end up with 28 results for each breed of donkey, one correct I believe but obviously hundreds of useless datapoints.
I know I'm probably being really dumb here, but I keep running into this same problem again and again with new donkeydata. I should obviously be structuring the whole thing a different way, because you just can't do this final query without an aggregate function. I think I must be missing something significant.
You can easily compute the proportion that are sick by averaging the sick flag from the donkeyhealth table (multiply by 100 for a percentage):
SELECT d.donkeybreed,
AVG( (dh.sick)::int ) AS proportion_sick
FROM donkeytable1 d JOIN
donkeyhealth dh
ON d.donkeyid = dh.donkeyid
GROUP BY d.donkeybreed
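The 28 results per breed in the question come from FROM totaldonkeys, sickdonkeys having no join condition, so every breed total is paired with every sick count. The sketch below runs both variants on made-up data in SQLite (via Python's sqlite3): the averaging approach from the answer, and the original two CTEs joined on donkeybreed. SQLite has no boolean type, so sick is stored as 0/1 here and the ::int cast is not needed. The breed names and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE donkeytable1 (donkeyid INTEGER, donkeybreed TEXT);
    CREATE TABLE donkeyhealth (donkeyid INTEGER, sick INTEGER);  -- 0 or 1

    -- Made-up herd: 1 of 4 Poitou and 1 of 2 Mammoth donkeys are sick.
    INSERT INTO donkeytable1 VALUES
        (1, 'Poitou'), (2, 'Poitou'), (3, 'Poitou'), (4, 'Poitou'),
        (5, 'Mammoth'), (6, 'Mammoth');
    INSERT INTO donkeyhealth VALUES
        (1, 1), (2, 0), (3, 0), (4, 0), (5, 1), (6, 0);
""")

# Variant 1: average the 0/1 flag per breed.
by_avg = conn.execute("""
    SELECT d.donkeybreed, 100.0 * AVG(dh.sick) AS percentsick
    FROM donkeytable1 d
    JOIN donkeyhealth dh ON d.donkeyid = dh.donkeyid
    GROUP BY d.donkeybreed
    ORDER BY d.donkeybreed
""").fetchall()

# Variant 2: the question's CTEs, joined on the breed instead of cross joined.
by_cte = conn.execute("""
    WITH totaldonkeys AS (
        SELECT donkeybreed, COUNT(*) AS total
        FROM donkeytable1
        GROUP BY donkeybreed
    ),
    sickdonkeys AS (
        SELECT donkeybreed, COUNT(*) AS totalsick
        FROM donkeytable1
        JOIN donkeyhealth ON donkeytable1.donkeyid = donkeyhealth.donkeyid
        WHERE donkeyhealth.sick = 1
        GROUP BY donkeybreed
    )
    SELECT t.donkeybreed,
           CAST(s.totalsick AS REAL) / t.total * 100 AS percentsick
    FROM totaldonkeys t
    JOIN sickdonkeys s ON s.donkeybreed = t.donkeybreed
    ORDER BY t.donkeybreed
""").fetchall()

print(by_avg)
print(by_cte)
```

Note that once the CTEs are joined on the breed, no GROUP BY is needed in the final query at all, since each CTE already has one row per breed. The inner join in variant 2 drops breeds with no sick donkeys; variant 1 keeps them with 0.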

Complex query in Oracle SQL

I have the following tables and their fields
I have been asked for a query that seems quite complex to me. I have been going in circles for two days trying things. The task says:
It is desired to obtain the average age of female athletes, medal winners (gold, silver or bronze), for the different modalities of 'Artistic Gymnastics'. Analyze the possible contents of the result field in order to return only the expected values, even when there is no data of any specific value for the set of records displayed by the query. Specifically, we want to show the gender indicator of the athletes, the medal obtained, and the average age of these athletes. The age will be calculated by subtracting from the system date (SYSDATE), the date of birth of the athlete, dividing said value by 365. In order to avoid showing decimals, truncate (TRUNC) the result of the calculation of age. Order the results by the average age of the athletes.
Well right now I have this:
select person.gender,score.score
from person,athlete,score,competition,sport
where person.idperson = athlete.idathlete and
athlete.idathlete= score.idathlete and
competition.idsport = sport.idsport and
person.gender='F' and competition.idsport=18 and score.score in
('Gold','Silver','Bronze')
group by
person.gender,
score.score;
And I got this out
When I add the person.birthdate field, instead of the 18 records for the 18 people who have a medal, I get many more records.
Apart from that, I still have to compute the average age with SYSDATE and TRUNC. I have tried many ways but I can't get it to work.
Either it is very complicated or I'm burned out from going in circles. I need some help.
Reading the task you got, it seems that you're quite close to the solution. Have a look at the following query and its explanation, note the differences from your query, see if it helps.
select p.gender,
((sysdate - p.birthday) / 365) age,
s.score
from person p join athlete a on a.idathlete = p.idperson
left join score s on s.idathlete = a.idathlete
left join competition c on c.idcompetition = s.idcompetition
where p.gender = 'F'
and s.score in ('Gold', 'Silver', 'Bronze')
and c.idsport = 18
order by age;
When two dates are subtracted, the result is a number of days. Dividing it by 365 gives, roughly, the number of years (this assumes 365 days per year for simplicity, which is not exact because of leap years). The result is usually a decimal number, e.g. 23.912874918724. You were told to remove the decimals, so use TRUNC and get 23 as the result.
Although the data model contains 5 tables, you don't have to use all of them in a query. Maybe the best approach is to go step by step. The first step would be to simply select all female athletes and calculate their age:
select p.gender,
(sysdate - p.birthday) / 365 age
from person p
where p.gender = 'F'
Note that I've used a table alias. I'd suggest you use them too, as they make queries easier to read (table names can be really long, which doesn't help readability). Also, always prefix columns with the table alias to avoid confusion about which column belongs to which table.
Once you're satisfied with that result, move on to another table: athlete. It is here just as a joining mechanism with the score table that contains ... well, scores. Note that I've used an outer join for the score table because not all athletes have won a medal. (Keep in mind that a filter on s.score in the WHERE clause discards the unmatched rows again; to keep them, the condition has to go into the ON clause.) I presume that this is what the task you've been given means by:
... even when there is no data of any specific value for the set of records displayed by the query.
It is suggested that we, as developers, use explicit table joins, which let you see all joins separated from filters (which should be part of the WHERE clause). So:
NO : from person p, athlete a
where a.idathlete = p.idperson
and p.gender = 'F'
YES: from person p join athlete a on a.idathlete = p.idperson
where p.gender = 'F'
Then move to yet another table, and so forth.
Test frequently, all the time - don't skip steps. Move on to another one only when you're sure that the previous step's result is correct, as - in most cases - it won't automagically fix itself.
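Following those steps to the end, the aggregated query could look roughly like the sketch below. It is only an illustration under several assumptions: it runs on SQLite (through Python's sqlite3) rather than Oracle, uses a fixed reference date instead of SYSDATE so the output is reproducible, assumes the birth date column is called birthdate as in the question, truncates the average rather than each individual age, and uses made-up rows. In Oracle the age expression would be along the lines of TRUNC(AVG((SYSDATE - p.birthdate) / 365)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person      (idperson INTEGER, gender TEXT, birthdate TEXT);
    CREATE TABLE athlete     (idathlete INTEGER);
    CREATE TABLE competition (idcompetition INTEGER, idsport INTEGER);
    CREATE TABLE score       (idathlete INTEGER, idcompetition INTEGER, score TEXT);

    -- Made-up data: three women, one man, two competitions.
    INSERT INTO person VALUES
        (1, 'F', '1990-01-01'), (2, 'F', '2000-01-01'),
        (3, 'F', '1998-01-01'), (4, 'M', '1990-01-01');
    INSERT INTO athlete VALUES (1), (2), (3), (4);
    INSERT INTO competition VALUES (1, 18), (2, 5);
    INSERT INTO score VALUES
        (1, 1, 'Gold'), (2, 1, 'Gold'), (3, 1, 'Silver'),
        (4, 1, 'Gold'),      -- male athlete, filtered out
        (1, 2, 'Gold');      -- other sport, filtered out
""")

REFERENCE_DATE = "2020-01-01"  # stands in for SYSDATE

rows = conn.execute("""
    SELECT p.gender,
           s.score,
           -- days between the dates / 365, averaged, then truncated
           CAST(AVG((julianday(?) - julianday(p.birthdate)) / 365) AS INTEGER)
               AS avg_age
    FROM person p
    JOIN athlete a     ON a.idathlete = p.idperson
    JOIN score s       ON s.idathlete = a.idathlete
    JOIN competition c ON c.idcompetition = s.idcompetition
    WHERE p.gender = 'F'
      AND c.idsport = 18
      AND s.score IN ('Gold', 'Silver', 'Bronze')
    GROUP BY p.gender, s.score
    ORDER BY avg_age
""", (REFERENCE_DATE,)).fetchall()

for row in rows:
    print(row)
```

The extra records in the question appear because putting person.birthdate into the select list also forces it into GROUP BY, which splits every group by birth date. Wrapping the age expression in AVG keeps the grouping on gender and medal only. This sketch does not produce a row for a medal that nobody won (Bronze in the sample data); that part of the task would need an outer join from a list of the three medal values.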

SQL Pivot Total Column

I am building an NFL Pickem application. The query in question should show a list of all players in the league along with the team they picked to win each game in a given week. This sample is hard coded for week 1 to keep it simpler and focus on my main question.
I am getting stuck trying to add an additional column for Total Points. This Total Points column should perform a calculation based on the Pick.ConfidencePoints column for each player for the given week.
The query below is working the way I want except for the Total Points column.
Whenever I try to add that column things get messed up.
The query currently produces results that look like this:
Here is the current query:
SELECT Player, [1],[2],[3]
FROM
(SELECT
Player.Name AS Player,
Game.Week,
Team.CityShort,
Game.ID AS GameId
FROM Pick
LEFT JOIN Player ON Pick.PlayerId = Player.Id
LEFT JOIN Team ON Pick.PickedWinnerTeamId = Team.Id
LEFT JOIN Game ON Pick.GameId = Game.Id
WHERE Game.Week = 1
GROUP BY Player.Name, Game.Week, Team.CityShort, Game.Id) AS SourceData
PIVOT
(
MAX (CityShort)
FOR GameId IN ([1],[2],[3])
) AS PivotTable
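One likely reason the extra column breaks things: in a PIVOT, every source column that is neither aggregated nor named in FOR becomes a grouping column, so adding Pick.ConfidencePoints to SourceData splits each player into one row per distinct points value. A way around it is to skip PIVOT and use conditional aggregation, where a plain SUM can sit next to the per-game columns. The sketch below shows that swapped-in technique on SQLite (via Python's sqlite3), not the original T-SQL. The rows are made up, and TotalPoints is assumed to be a plain sum of ConfidencePoints, since the question does not say what the real calculation is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Player (Id INTEGER, Name TEXT);
    CREATE TABLE Team   (Id INTEGER, CityShort TEXT);
    CREATE TABLE Game   (Id INTEGER, Week INTEGER);
    CREATE TABLE Pick   (PlayerId INTEGER, GameId INTEGER,
                         PickedWinnerTeamId INTEGER, ConfidencePoints INTEGER);

    -- Made-up league: two players, four teams, three week-1 games.
    INSERT INTO Player VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Team   VALUES (1, 'NY'), (2, 'DEN'), (3, 'DAL'), (4, 'SEA');
    INSERT INTO Game   VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO Pick   VALUES
        (1, 1, 1, 3), (1, 2, 3, 2), (1, 3, 4, 1),
        (2, 1, 2, 1), (2, 2, 3, 3);          -- Bob made no pick for game 3
""")

rows = conn.execute("""
    SELECT pl.Name AS Player,
           MAX(CASE WHEN g.Id = 1 THEN t.CityShort END) AS game1,
           MAX(CASE WHEN g.Id = 2 THEN t.CityShort END) AS game2,
           MAX(CASE WHEN g.Id = 3 THEN t.CityShort END) AS game3,
           -- placeholder rule: plain sum of the week's confidence points
           SUM(pk.ConfidencePoints) AS TotalPoints
    FROM Pick pk
    JOIN Player pl ON pk.PlayerId = pl.Id
    JOIN Team t    ON pk.PickedWinnerTeamId = t.Id
    JOIN Game g    ON pk.GameId = g.Id
    WHERE g.Week = 1
    GROUP BY pl.Name
    ORDER BY pl.Name
""").fetchall()

for row in rows:
    print(row)
```

The same CASE expressions are valid in SQL Server. If the real rule only counts points for correct picks, the SUM would need a CASE of its own that checks the game's winner.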

MS ACCESS SQL Query two tables getting various data

I am new to SQL and am having trouble setting up this query. I have two tables, one which holds info about the teams, named TEAMS which looks like this:
TEAMS
Name|City|Attendance
Jets| NY| 50
...
And the other which holds info about the games played, named GAMES:
GAMES
Home|Visitors|Date |Result
Jets| Broncos| 1/1/2012| Tie
...
For this specific query I need to find each team that had one or more home games, and give the name of the team, the number of wins, the number of losses, and the number of ties. I'm having trouble figuring out how to combine the data. I have made several queries that individually find the number of losses, wins and ties, but I don't know how to join them properly or whether that is even the right approach. Thanks!
This should get you pretty close. Without understanding your data fully I can't really give you a perfect working query, but at least you can see what the join might look like.
SELECT TeamName, SUM(SWITCH(Result = 'Win', 1)) AS Wins, SUM(SWITCH(Result = 'Tie', 1)) AS Ties, SUM(SWITCH(Result = 'Loss', 1)) AS Loss
FROM Teams INNER JOIN Games ON (Teams.TeamName = Games.Home OR Teams.TeamName = Games.Visitors)
GROUP BY TeamName
HAVING MAX(SWITCH(Teams.TeamName = Games.Home, 1)) = 1;
It'd be better database design to have IDs instead of team names in the games table. Also, with a description like "Tie", "Win" or "Loss", I wasn't sure which team the result refers to (a tie is obviously easy), so right now the query just counts whatever is in that column for both teams. I'm sure that is incorrect, but it should be a small change to fix it.
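As a rough sketch of what that small change could look like, the example below assumes that Result is recorded from the home team's point of view, so a 'Win' counts as a loss for the visitors. That assumption is mine, not something the question confirms. It runs on SQLite through Python's sqlite3 with made-up games, and uses CASE where Access would use IIF or SWITCH.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TEAMS (Name TEXT, City TEXT, Attendance INTEGER);
    CREATE TABLE GAMES (Home TEXT, Visitors TEXT, "Date" TEXT, Result TEXT);

    -- Made-up data; Result is assumed to describe the HOME team's outcome.
    INSERT INTO TEAMS VALUES
        ('Jets', 'NY', 50), ('Broncos', 'Denver', 40), ('Bears', 'Chicago', 30);
    INSERT INTO GAMES VALUES
        ('Jets',    'Broncos', '2012-01-01', 'Tie'),
        ('Jets',    'Bears',   '2012-01-08', 'Win'),
        ('Broncos', 'Jets',    '2012-01-15', 'Win');
""")

rows = conn.execute("""
    SELECT t.Name,
           -- a home 'Win' or an away 'Loss' is a win for this team
           SUM(CASE WHEN (g.Home = t.Name AND g.Result = 'Win')
                      OR (g.Visitors = t.Name AND g.Result = 'Loss')
                    THEN 1 ELSE 0 END) AS Wins,
           SUM(CASE WHEN (g.Home = t.Name AND g.Result = 'Loss')
                      OR (g.Visitors = t.Name AND g.Result = 'Win')
                    THEN 1 ELSE 0 END) AS Losses,
           SUM(CASE WHEN g.Result = 'Tie' THEN 1 ELSE 0 END) AS Ties
    FROM TEAMS t
    JOIN GAMES g ON t.Name = g.Home OR t.Name = g.Visitors
    GROUP BY t.Name
    -- keep only teams with at least one home game
    HAVING SUM(CASE WHEN g.Home = t.Name THEN 1 ELSE 0 END) >= 1
    ORDER BY t.Name
""").fetchall()

for row in rows:
    print(row)
```

The Bears only appear as visitors in this sample, so the HAVING clause drops them. If only home games should be counted, join on g.Home = t.Name alone and the HAVING clause becomes unnecessary.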