SQL Two aggregate functions over three tables - sql

I'm doing a database project and I've spent about 3 hours confusing myself so I thought I'd try to get some help on here.
I have tables for music genres, competitions and entries to the competitions.
I need to work out how many competitions there are for each genre and how many entries there are for each genre.
My aggregate functions for competitions and entries work when I only have one of them in the query, but when I have both in the same query the competitions column shows the same numbers as the entries column. I have no idea what I'm doing wrong; it's probably something stupid and simple.
Here's my query:
SELECT Genre.Genre, count(Competition.Genre_ID)Competitions, count(Comp_Entry.Comp_ID)Bands
FROM Genre, Competition, Comp_Entry
WHERE Genre.Genre_ID = Competition.Genre_ID
AND Comp_Entry.Comp_ID = Competition.Comp_ID
GROUP BY Competition.Genre_ID, Genre.Genre
ORDER BY Genre.Genre;
Can anyone see what I'm doing wrong?
Thanks.

You should use proper join syntax. But the problem you are facing is that the joins produce one row per entry, so a plain count() on a competition column counts entries too. You need count(distinct) for the competitions:
SELECT Genre.Genre, count(distinct Competition.Comp_ID) as Competitions,
count(Comp_Entry.Comp_ID) as Bands
FROM Genre join
Competition
on Genre.Genre_ID = Competition.Genre_ID join
Comp_Entry
on Comp_Entry.Comp_ID = Competition.Comp_ID
GROUP BY Competition.Genre_ID, Genre.Genre
ORDER BY Genre.Genre;
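To see why the plain count() matches the entries column, here is a minimal sketch using Python's built-in sqlite3 module with made-up data. The table and column names follow the question, except Band_ID, which is a hypothetical column since the question does not show what Comp_Entry contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genre (Genre_ID INTEGER PRIMARY KEY, Genre TEXT);
    CREATE TABLE Competition (Comp_ID INTEGER PRIMARY KEY, Genre_ID INTEGER);
    -- Band_ID is a hypothetical column, not taken from the question
    CREATE TABLE Comp_Entry (Comp_ID INTEGER, Band_ID INTEGER);
    INSERT INTO Genre VALUES (1, 'Jazz'), (2, 'Rock');
    INSERT INTO Competition VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO Comp_Entry VALUES
        (10, 1), (10, 2), (11, 3), (20, 4), (20, 5), (20, 6);
""")

# The join yields one row per entry, so a plain COUNT on any
# non-null column returns the number of entries in the genre.
rows = conn.execute("""
    SELECT g.Genre,
           COUNT(c.Genre_ID)         AS joined_rows,
           COUNT(DISTINCT c.Comp_ID) AS competitions,
           COUNT(e.Comp_ID)          AS entries
    FROM Genre g
    JOIN Competition c ON c.Genre_ID = g.Genre_ID
    JOIN Comp_Entry e ON e.Comp_ID = c.Comp_ID
    GROUP BY g.Genre_ID, g.Genre
    ORDER BY g.Genre
""").fetchall()

for row in rows:
    print(row)
```

With this data, Jazz has two competitions and three entries, and only the count(distinct) column reports the two. Note that inner joins drop genres without competitions and competitions without entries; LEFT JOINs would keep them.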

Related

PostgreSQL: Multiple table joins, query joined data

I am working on a class question, so please don't provide a direct answer, just some guidance.
I have joined 3 tables (businesses, categories, and countries). I now need to answer the following question: "Which are the most common categories for the oldest businesses on each continent?"
Here is the code that works to join the tables:
SELECT bus.business,
bus.year_founded,
bus.category_code,
bus.country_code,
cat.category,
cou.continent
FROM businesses AS bus
INNER JOIN categories AS cat
ON bus.category_code = cat.category_code
INNER JOIN countries AS cou
ON bus.country_code = cou.country_code
ORDER BY year_founded
I have the 3 tables joined but now need to run an additional query to answer the above question. I have tried adding the COUNT or MIN functions to the initial SELECT for the joins, but that throws an error. I also tried adding a GROUP BY clause at the end of the join, but that also gives an error.
Any suggestions would be appreciated

SQL Count & Join

Just not sure what I need to do here. I have to count the total movies based off a certain production company, so the question is this:
How many movies in the database were produced by Pixar Animation Studios?
This is my SQL code so far, I work off Jupyter:
select movies.movie_id, movies.title, productioncompanies.production_company_id, productioncompanies.production_company_name
from movies, productioncompanies
where production_company_name = "Pixar Animation Studios"
A possible solution is the following:
select count(*)
from movies join productioncompanies
on movies.production_company_id = productioncompanies.production_company_id
where production_company_name = 'Pixar Animation Studios';
%%sql
select count(*), p.production_company_name
from productioncompanymap as m
left join productioncompanies as p
on m.production_company_id = p.production_company_id
where p.production_company_name = 'Pixar Animation Studios'
group by p.production_company_name
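As a rough sketch of how the mapping-table approach plays out, the following uses Python's built-in sqlite3 module with invented rows. The layout of productioncompanymap (one row per movie and company pair, with a movie_id column) is an assumption; the thread does not show the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE productioncompanies (
        production_company_id INTEGER PRIMARY KEY,
        production_company_name TEXT
    );
    -- Assumed layout: one row per (movie, company) pair
    CREATE TABLE productioncompanymap (
        movie_id INTEGER,
        production_company_id INTEGER
    );
    INSERT INTO productioncompanies VALUES
        (100, 'Pixar Animation Studios'),
        (200, 'DreamWorks Animation');
    INSERT INTO productioncompanymap VALUES (1, 100), (2, 100), (3, 200);
""")

join_sql = """
    FROM productioncompanymap AS m
    JOIN productioncompanies AS p
      ON p.production_company_id = m.production_company_id
"""

# Movies per production company: group the mapped rows by company name.
per_company = conn.execute(
    "SELECT p.production_company_name, COUNT(*) AS movie_count"
    + join_sql
    + "GROUP BY p.production_company_name ORDER BY p.production_company_name"
).fetchall()

# Movies for a single company: filter first, then count the remaining rows.
pixar_count = conn.execute(
    "SELECT COUNT(*)" + join_sql + "WHERE p.production_company_name = ?",
    ("Pixar Animation Studios",),
).fetchone()[0]

print(per_company, pixar_count)
```

The string literal is passed as a parameter here, which also sidesteps the single versus double quote issue in the original query.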
I apologize for the vague detail to my question. I am still new to stack overflow so still finding my feet.
I have to link two databases of movies. The first has a set of movie details (movie name, budget, rating, release date and reviews, as well as a movie_id); the second has a genre_id and a genre listing. I need to link each movie to its production company, e.g. Monsters Inc. to Pixar Animation Studios, but the two databases do not share a key to link them. Once I have the list of movies linked to their production companies, I have to count the total movies per production company.
I hope this gives more detail, oh yeah, I am a student and this is one of my tests I have to do. I am unsure of how to join the tables and where.

Aggregate function returning different result compared to the algebraic expression of rows

I've recently started with SQL and would like to know the following:
I have a problem I'm trying to solve. My solution and the official solution are very similar, yet they return very different results. The problem involves a table with the following values:
I am supposed to "find the total domestic and international sales that can be attributed to each director"
Here is what I wrote :
SELECT Director,(Domestic_sales + International_sales) AS Total_sales From Movies
INNER JOIN Boxoffice
ON Movies.Id=Boxoffice.Movie_Id
GROUP BY Director
And here is the official solution:
SELECT director, SUM(domestic_sales + international_sales) as Cumulative_sales_from_all_movies
FROM movies
INNER JOIN boxoffice
ON movies.id = boxoffice.movie_id
GROUP BY director;
I understand that the SUM aggregate function will do the trick, but why does simply adding up the values as I did return a different value? Is it because it's not taking the different films into consideration, but just adding up the values for one of the movies and returning that result?
I've looked elsewhere and also checked other questions to see if I could answer this, but to no avail.
Thanks everyone and have a great week!
"why does simply adding up the values as I did return a different value? Is it because it's not taking the different films into consideration, but just adding up the values for one of the movies and returning that result?"
Your query is not standard SQL. In fact, this will return an error in almost all databases, including the most recent versions of MySQL, because you are aggregating by one column but have other non-aggregated columns in the SELECT:
SELECT m.Director, (bo.Domestic_sales + bo.International_sales) AS Total_sales
FROM Movies m JOIN
Boxoffice bo
ON m.Id = bo.Movie_Id
GROUP BY m.Director;
(I added table aliases, which are highly recommended.)
In older versions of MySQL, this returns the value of the expression from one arbitrary matching row in each group, not a sum. The equivalent in more recent versions uses ANY_VALUE():
SELECT m.Director, ANY_VALUE(bo.Domestic_sales + bo.International_sales) AS Total_sales
FROM Movies m JOIN
Boxoffice bo
ON m.Id = bo.Movie_Id
GROUP BY m.Director;
Obviously, an arbitrary value is different from a SUM().
I would advise you to configure the session to avoid this problem. You can enable the ONLY_FULL_GROUP_BY SQL mode to get the standard, compatible behavior, in which a query like yours is rejected with an error.
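The difference can be reproduced with a small sketch in Python's built-in sqlite3 module. SQLite, like older MySQL, accepts a non-aggregated expression in a grouped query and evaluates it against one arbitrarily chosen row of the group. The director name and sales figures below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (Id INTEGER PRIMARY KEY, Director TEXT);
    CREATE TABLE Boxoffice (
        Movie_Id INTEGER,
        Domestic_sales INTEGER,
        International_sales INTEGER
    );
    -- Two movies by the same (made-up) director
    INSERT INTO Movies VALUES (1, 'Director A'), (2, 'Director A');
    INSERT INTO Boxoffice VALUES (1, 100, 50), (2, 200, 25);
""")

# Non-aggregated expression: computed from a single row per group.
bare = conn.execute("""
    SELECT m.Director, (bo.Domestic_sales + bo.International_sales) AS Total_sales
    FROM Movies m JOIN Boxoffice bo ON m.Id = bo.Movie_Id
    GROUP BY m.Director
""").fetchall()

# SUM(): the expression is computed for every row, then added up per group.
summed = conn.execute("""
    SELECT m.Director, SUM(bo.Domestic_sales + bo.International_sales) AS Total_sales
    FROM Movies m JOIN Boxoffice bo ON m.Id = bo.Movie_Id
    GROUP BY m.Director
""").fetchall()

print("bare:", bare)
print("summed:", summed)
```

The first query reports the total of just one of the two movies (which one is not guaranteed), while the second reports 375, the total across both.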

SQL Join query brings multiple results

I have 2 tables. One lists all the goals scored in the English Premier League and who scored it and the other, the squad numbers of each player in the league.
I want to do a join so that the table sums the total number of goals by player name, and then looks up the squad number of that player.
Table A [goal_scorer]
Table B [squads]
I have the SQL query below:
SELECT goal_scorer.*,sum(goal_scorer.number),squads.squad_number
FROM goal_scorer
Inner join squads on goal_scorer.name=squads.player
group by goal_scorer.name
The issue I have is that in the result, the sum of 'number' is too high and seems to include duplicate rows. For example, Aaron Lennon has scored 33 times, not 264 as shown below.
Maybe you want something like this?
SELECT goal_scorer.*, s.total, squads.squad_number
FROM goal_scorer
LEFT JOIN (
SELECT name, sum(number) as total
FROM goal_scorer
GROUP BY name
) s on s.name = goal_scorer.name
JOIN squads on goal_scorer.name=squads.player
There are other ways to do it, but here I'm using a sub-query to get the total by player. NB: Most modern SQL platforms support windowing functions to do this too.
Also, you probably don't need the LEFT on the join to the sub-query (since we know there will always be at least one matching name), but I put it in in case your actual use case is more complicated.
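For illustration, here is a minimal sketch of the row multiplication and the aggregate-first fix, using Python's built-in sqlite3 module. It assumes the inflated totals come from squads holding several rows per player; the season column and all the data are hypothetical, since the original tables were only posted as images.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goal_scorer (name TEXT, number INTEGER);
    -- season is a hypothetical column that gives a player several squad rows
    CREATE TABLE squads (player TEXT, season TEXT, squad_number INTEGER);
    INSERT INTO goal_scorer VALUES
        ('Player A', 2), ('Player A', 1), ('Player B', 4);
    INSERT INTO squads VALUES
        ('Player A', '2010', 7), ('Player A', '2011', 7), ('Player B', '2010', 9);
""")

# Joining before aggregating repeats each goal row once per squad row.
inflated = conn.execute("""
    SELECT g.name, SUM(g.number)
    FROM goal_scorer g
    JOIN squads s ON s.player = g.name
    GROUP BY g.name
    ORDER BY g.name
""").fetchall()

# Aggregating first, then joining to one row per player, keeps totals intact.
fixed = conn.execute("""
    SELECT t.name, t.total, s.squad_number
    FROM (SELECT name, SUM(number) AS total
          FROM goal_scorer
          GROUP BY name) t
    JOIN (SELECT DISTINCT player, squad_number FROM squads) s
      ON s.player = t.name
    ORDER BY t.name
""").fetchall()

print("inflated:", inflated)
print("fixed:", fixed)
```

Player A's three goals are reported as six by the first query because the player has two squad rows. The DISTINCT only collapses squad rows if the squad number never changes; otherwise you would need a rule for picking one.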
Can you try this if you are using sql-server?
select *
from squads
outer apply(
select sum(goal_scorer.number) as score
from goal_scorer where goal_scorer.name=squads.player
)x

correlated query to update a table based on a select

I have two tables, Genre and Songs. There is a many-to-many relationship between them: one genre can have many songs, and one song may belong to many genres (say there is a song xyz that belongs to rap; it can also belong to hip-hop). A third table, GenreSongs, maps this relationship with GenreID and SongID columns. What I am supposed to do is add a column named SongsCount to the Genre table, containing the number of songs in each genre. I can alter the table to add the column, and I can write a query that gives the song count:
SELECT GenreID, Count(SongID) FROM GenreSongs GROUP BY GenreID
Now, this gives us what we require, the number of songs per genre, but how can I use this query to update the column I made (SongsCount)? One way is to run this query, look at the results, and then update the column manually, but I am sure everyone will agree that's not a programmatic way to do it.
I think I need a query with a subquery that takes the value of GenreID from the outer query and counts its rows in the inner query (a correlated query), but I can't work out how to write it. Can anyone please help me with this?
The question of how to approach this depends on the size of your data and how frequently it is updated. Here are some scenarios.
If your songs are updated quite frequently and your tables are quite large, then you might want to have a column in Genre with the count, and update the column using a trigger on the Songs table.
Alternatively, you could build an index on the GenreSongs table on GenreID. Then the following query:
select count(*)
from GenreSongs gs
where gs.GenreID = <whatever>
should run quite fast.
If your songs are updated infrequently or in a batch (say nightly or weekly), then you can update the song count as part of the batch. Your query might look like this (it uses UPDATE ... FROM, which SQL Server and PostgreSQL support, and assumes Genre's key column is also named GenreID):
update Genre
set SongsCount = gc.cnt
from (select GenreID, count(*) as cnt from GenreSongs group by GenreID) gc
where Genre.GenreID = gc.GenreID
And yet another possibility is that you don't need to store the value at all. You can make it part of a view/query that does the calculation on the fly.
Relational databases are quite flexible, and there is often more than one way to do things. The right approach depends very much on what you are trying to accomplish.
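Since the question asks specifically for a correlated query, here is one possible sketch, run through Python's built-in sqlite3 module so it can be tried as-is. It assumes Genre's key column is named GenreID and adds a hypothetical Name column; neither is shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: Genre is keyed by GenreID; Name is a hypothetical column
    CREATE TABLE Genre (
        GenreID INTEGER PRIMARY KEY,
        Name TEXT,
        SongsCount INTEGER DEFAULT 0
    );
    CREATE TABLE GenreSongs (GenreID INTEGER, SongID INTEGER);
    INSERT INTO Genre (GenreID, Name) VALUES (1, 'Rap'), (2, 'Hip-hop'), (3, 'Polka');
    INSERT INTO GenreSongs VALUES (1, 10), (1, 11), (2, 10);
""")

# Correlated subquery: for each Genre row being updated, the inner query
# counts the GenreSongs rows that share that row's GenreID.
conn.execute("""
    UPDATE Genre
    SET SongsCount = (SELECT COUNT(*)
                      FROM GenreSongs gs
                      WHERE gs.GenreID = Genre.GenreID)
""")

counts = conn.execute(
    "SELECT Name, SongsCount FROM Genre ORDER BY GenreID"
).fetchall()
print(counts)
```

Because COUNT(*) over no rows is 0, genres without songs end up with 0 rather than NULL. The stored value still goes stale whenever GenreSongs changes, so the statement has to be rerun (or replaced by a trigger) as discussed above.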
Storing the count in a SongsCount column is just plainly bad design (redundant data and update overhead). Instead use this query for single results:
SELECT ID, ...,
(SELECT Count(*) FROM GenreSongs WHERE GenreID = X) AS SongsCount
FROM Genre
WHERE ID = X
And this for multiple results (much more efficient):
SELECT ID, ..., SongsCount
FROM (SELECT GenreID, Count(*) AS SongsCount FROM GenreSongs GROUP BY GenreID) AS sub
RIGHT JOIN Genre AS g ON sub.GenreID = g.ID
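If you go the compute-on-the-fly route, one option is to wrap the query in a view. The sketch below uses Python's built-in sqlite3 module; the view name GenreWithCount and the Name column are hypothetical, and it uses a LEFT JOIN from Genre so that genres without songs show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Name is a hypothetical column; ID follows the queries above
    CREATE TABLE Genre (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE GenreSongs (GenreID INTEGER, SongID INTEGER);
    INSERT INTO Genre VALUES (1, 'Rap'), (2, 'Hip-hop'), (3, 'Polka');
    INSERT INTO GenreSongs VALUES (1, 10), (1, 11), (2, 10);

    -- COUNT(gs.SongID) ignores the NULLs produced for genres with no songs
    CREATE VIEW GenreWithCount AS
    SELECT g.ID, g.Name, COUNT(gs.SongID) AS SongsCount
    FROM Genre g
    LEFT JOIN GenreSongs gs ON gs.GenreID = g.ID
    GROUP BY g.ID, g.Name;
""")

view_sql = "SELECT Name, SongsCount FROM GenreWithCount ORDER BY ID"
before = conn.execute(view_sql).fetchall()

# The view is recomputed on every read, so new mappings show up immediately.
conn.execute("INSERT INTO GenreSongs VALUES (3, 12)")
after = conn.execute(view_sql).fetchall()

print(before)
print(after)
```

Nothing needs to be kept in sync here; the trade-off is that the count is recalculated on each query.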