SQL multiple join question: can't join 5 tables, problem with MAX - sql

I got 6 tables:
Albums
id_album | title | id_band | year |
Bands
id_band | name |style | origin
composers
id_musician | id_song
members
id_musician | id_band | instrument
musicians
id_musician | name | birth | death | gender
songs
id_song | title | duration | id_album
I need to write a query that returns the six bands with the most members and, for each of those bands, the duration and title of its longest song.
So far, I can get the biggest bands:
SELECT bands.name, COUNT(id_musician) AS numberMusician
FROM bands
INNER JOIN members USING (id_band)
GROUP BY bands.name
ORDER BY numberMusician DESC
LIMIT 6;
I can also get the longest songs:
SELECT MAX(duration), songs.title, id_album, id_band
FROM SONGs
INNER JOIN albums USING (id_album)
GROUP BY songs.title, id_album, id_band
ORDER BY MAX(duration) DESC
The problem occurs when I am trying to write a subquery to get the band with the corresponding song and its duration. Trying to do it with inner joins also gets me undesired results. Could someone help me?
I have tried to put the subquery in the where, but I can't find how to do it due to MAX.
Thanks

I find that using lateral joins makes the query easier to write. You already have the join logic right, so we just need to correlate the bands with the musicians and the songs.
So:
select b.name, m.*, s.*
from bands b
cross join lateral (
select count(*) as cnt_musicians
from members m
where m.id_band = b.id_band
) m
cross join lateral (
select s.title, s.duration
from songs s
inner join albums a using (id_album)
where a.id_band = b.id_band
order by s.duration desc limit 1
) s
order by m.cnt_musicians desc
limit 6
For each band, subquery m counts the number of musicians in that band (its where clause correlates to the outer query), while s retrieves the longest song, using correlation, order by and limit. The outer query just combines the information, then orders the rows and selects the top 6 bands.
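If your engine has no LATERAL, the same per-band correlation can be approximated with correlated subqueries. Below is a minimal sketch using Python's built-in sqlite3 module (SQLite has no LATERAL). The sample rows and the helper names build_demo_db and top_bands are made up for illustration, and only the columns the query needs are created.

```python
import sqlite3

def build_demo_db():
    # Hypothetical sample data; only the columns used by the query are created.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bands   (id_band INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE members (id_musician INTEGER, id_band INTEGER, instrument TEXT);
        CREATE TABLE albums  (id_album INTEGER PRIMARY KEY, title TEXT, id_band INTEGER, year INTEGER);
        CREATE TABLE songs   (id_song INTEGER PRIMARY KEY, title TEXT, duration INTEGER, id_album INTEGER);
        INSERT INTO bands VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        INSERT INTO members VALUES (1, 1, 'guitar'), (2, 1, 'bass'), (3, 1, 'drums'),
                                   (4, 2, 'guitar'), (5, 2, 'vocals'),
                                   (6, 3, 'piano');
        INSERT INTO albums VALUES (10, 'A1', 1, 2000), (11, 'A2', 1, 2005),
                                  (20, 'B1', 2, 2010), (30, 'G1', 3, 2015);
        INSERT INTO songs VALUES (100, 'Short A', 120, 10), (101, 'Long A', 400, 11),
                                 (200, 'Only B', 250, 20),
                                 (300, 'Long G', 500, 30), (301, 'Short G', 90, 30);
    """)
    return conn

# The scalar subquery plays the role of lateral subquery "m"; the correlated
# subquery in the ON clause picks the id of each band's longest song, which
# plays the role of lateral subquery "s".
QUERY = """
SELECT b.name,
       (SELECT COUNT(*) FROM members m WHERE m.id_band = b.id_band) AS cnt_musicians,
       s.title,
       s.duration
FROM bands b
JOIN songs s
  ON s.id_song = (SELECT s2.id_song
                  FROM songs s2
                  JOIN albums a ON a.id_album = s2.id_album
                  WHERE a.id_band = b.id_band
                  ORDER BY s2.duration DESC
                  LIMIT 1)
ORDER BY cnt_musicians DESC
LIMIT ?
"""

def top_bands(conn, n):
    # Returns (band name, member count, longest song title, its duration).
    return conn.execute(QUERY, (n,)).fetchall()

if __name__ == "__main__":
    for row in top_bands(build_demo_db(), 6):
        print(row)
```

As with the lateral version, a band with no songs drops out of the result, and if two songs tie for the longest duration, which one is returned is arbitrary.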

Related

SQL: how to select result of joined tables with limit based on main table rows quantity?

I have a table, movies.
+----+------+
| id | name |
+----+------+
And I have another table, genres.
+----+------+----------+
| id | name | movie_id |
+----+------+----------+
Suppose that every movie record has 3 genre records linked by movie_id.
I need to get 10 movie records with ALL genre records joined.
My query is:
select *
from movies
left join genres on movies.id = genres.movie_id
limit 10;
Result is 10 rows, but I want to get 10 movie rows with ALL genre rows joined = 30 rows.
What should the query be?
You can select only 10 movies in a subquery:
select * from
(select * from movies limit 10) movies1
left join genres on movies1.id = genres.movie_id
Using GROUP BY you can retrieve one row per movie with concatenated genres:
select movies.id, movies.name, GROUP_CONCAT(genres.name) genres
from movies
left join genres on movies.id = genres.movie_id
group by movies.id, movies.name
limit 10;
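To see why the position of the LIMIT matters, here is a small sketch with Python's built-in sqlite3 module. The three movies with two genres each are invented sample data, the function names are hypothetical, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

def build_movies_db():
    # Hypothetical sample data: 3 movies, 2 genres each.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE genres (id INTEGER PRIMARY KEY, name TEXT, movie_id INTEGER);
        INSERT INTO movies VALUES (1, 'M1'), (2, 'M2'), (3, 'M3');
        INSERT INTO genres VALUES (1, 'drama', 1), (2, 'crime', 1),
                                  (3, 'comedy', 2), (4, 'romance', 2),
                                  (5, 'horror', 3), (6, 'thriller', 3);
    """)
    return conn

def limit_after_join(conn, n):
    # LIMIT applies to the joined rows, so it cuts off genres.
    sql = """
        SELECT m.id, g.name
        FROM movies m
        LEFT JOIN genres g ON g.movie_id = m.id
        ORDER BY m.id, g.id
        LIMIT ?
    """
    return conn.execute(sql, (n,)).fetchall()

def limit_before_join(conn, n):
    # LIMIT applies to movies only; every genre of those movies is kept.
    sql = """
        SELECT m.id, g.name
        FROM (SELECT id, name FROM movies ORDER BY id LIMIT ?) AS m
        LEFT JOIN genres g ON g.movie_id = m.id
        ORDER BY m.id, g.id
    """
    return conn.execute(sql, (n,)).fetchall()

if __name__ == "__main__":
    conn = build_movies_db()
    print(limit_after_join(conn, 2))   # 2 rows, all from the first movie
    print(limit_before_join(conn, 2))  # 4 rows: 2 movies x 2 genres
```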

How to use partition on a join to get a count

I'm confused on how to get a count without using group by on a join
I know I can get the desired results using group by, but the table joins are long and lots of selected headers with case statement so I was hoping to avoid that
I'm sure I've seen this done before using partition over but can't find a good example using it on a join. Maybe it's not possible!?
I've tried
select
p.FirstName,
p.Surname,
count(pr.RelativePersonId) over (partition by pr.RelativePersonId) as [RelativesOnRecord]
from People p
left join PersonRelatives pr
on p.PersonId = pr.PersonId
For my tables:
People
PersonId | FirstName | Surname
1 Jim Bo
2 Harry Bo
3 Strong Bo
PersonRelatives
Id | PersonId | RelativePersonId
1 1 2
2 1 3
Where I'm trying to get
PersonId | FirstName | Surname | RelativesOnRecord
1 Jim Bo 2
I also tried joining with a SELECT TOP 1 but that just gives me the one result so one count. Is this even possible without group by?
It seems you are partitioning by the wrong column. You want the number of relatives for each person from People, right? Use
count(pr.RelativePersonId) over (partition by pr.PersonId) as [RelativesOnRecord]
Based on your example, you want aggregation:
select p.PersonId, p.FirstName, p.Surname, count(*) as [RelativesOnRecord]
from People p join
PersonRelatives pr
on p.PersonId = pr.PersonId
group by p.PersonId, p.FirstName, p.Surname;
You could use apply or a correlated subquery, but window functions do not seem appropriate here.
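For completeness, the correlated-subquery variant mentioned above could look roughly like this. It is a sketch run through Python's built-in sqlite3 module, using the sample rows from the question; the function names are made up. Note that, unlike the inner join with GROUP BY, it also returns people with zero relatives.

```python
import sqlite3

def build_people_db():
    # Sample rows copied from the question.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE People (PersonId INTEGER PRIMARY KEY, FirstName TEXT, Surname TEXT);
        CREATE TABLE PersonRelatives (Id INTEGER PRIMARY KEY, PersonId INTEGER, RelativePersonId INTEGER);
        INSERT INTO People VALUES (1, 'Jim', 'Bo'), (2, 'Harry', 'Bo'), (3, 'Strong', 'Bo');
        INSERT INTO PersonRelatives VALUES (1, 1, 2), (2, 1, 3);
    """)
    return conn

def relatives_per_person(conn):
    # One output row per person; the count is computed per row by the
    # correlated subquery, so no GROUP BY is needed in the outer query.
    sql = """
        SELECT p.PersonId, p.FirstName, p.Surname,
               (SELECT COUNT(*)
                FROM PersonRelatives pr
                WHERE pr.PersonId = p.PersonId) AS RelativesOnRecord
        FROM People p
        ORDER BY p.PersonId
    """
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    for row in relatives_per_person(build_people_db()):
        print(row)
```

Because the outer query never joins to PersonRelatives, a long select list with CASE expressions can stay exactly as it is.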

postgres STRING_AGG() returns duplicates?

I have seen some similar posts requesting advice on getting distinct results from a query. This can be solved with a subquery, but the column I am aggregating, image_name, is unique (image_name VARCHAR(40) NOT NULL UNIQUE), so I don't believe that should be necessary.
This is the data in the spot_images table
spotdk=# select * from spot_images;
id | user_id | spot_id | image_name
----+---------+---------+--------------------------------------
1 | 1 | 1 | 81198013-e8f8-4baa-aece-6fbda15a0498
2 | 1 | 1 | 21b78e4e-f2e4-4d66-961f-83e5c28d69c5
3 | 1 | 1 | 59834585-8c49-4cdf-95e4-38c437acb3c1
4 | 1 | 1 | 0a42c962-2445-4b3b-97a6-325d344fda4a
(4 rows)
SELECT Round(Avg(ratings.rating), 2) AS rating,
spots.*,
String_agg(spot_images.image_name, ',') AS imageNames
FROM spots
FULL OUTER JOIN ratings
ON ratings.spot_id = spots.id
INNER JOIN spot_images
ON spot_images.spot_id = spots.id
WHERE spots.id = 1
GROUP BY spots.id;
This is the result of the images row:
81198013-e8f8-4baa-aece-6fbda15a0498,
21b78e4e-f2e4-4d66-961f-83e5c28d69c5,
59834585-8c49-4cdf-95e4-38c437acb3c1,
0a42c962-2445-4b3b-97a6-325d344fda4a,
81198013-e8f8-4baa-aece-6fbda15a0498,
21b78e4e-f2e4-4d66-961f-83e5c28d69c5,
59834585-8c49-4cdf-95e4-38c437acb3c1,
0a42c962-2445-4b3b-97a6-325d344fda4a,
81198013-e8f8-4baa-aece-6fbda15a0498,
21b78e4e-f2e4-4d66-961f-83e5c28d69c5,
59834585-8c49-4cdf-95e4-38c437acb3c1,
0a42c962-2445-4b3b-97a6-325d344fda4a
Not with linebreaks, I added them for visibility.
What should I do to retrieve the image_name's one time each?
If you don't want duplicates, use DISTINCT:
String_agg(distinct spot_images.image_name, ',') AS imageNames
Likely, there are several rows in ratings that match the given spot, and several rows in spot_images that match the given spot as well. As a result, rows are getting duplicated.
One option to avoid that is to aggregate in subqueries:
SELECT r.avg_rating,
s.*,
si.image_names
FROM spots s
FULL OUTER JOIN (
SELECT spot_id, Round(Avg(ratings.rating), 2) avg_rating
FROM ratings
GROUP BY spot_id
) r ON r.spot_id = s.id
INNER JOIN (
SELECT spot_id, string_agg(spot_images.image_name, ',') image_names
FROM spot_images
GROUP BY spot_id
) si ON si.spot_id = s.id
WHERE s.id = 1
This could actually be more efficient than outer aggregation.
Note: it is hard to tell without seeing your data, but I am unsure that you really need a FULL JOIN here. A LEFT JOIN might actually be what you want.
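The row multiplication is easy to reproduce. The sketch below uses Python's built-in sqlite3 module, so SQLite's group_concat stands in for PostgreSQL's string_agg and a LEFT JOIN stands in for the FULL OUTER JOIN. The three ratings and four short image names are invented sample data, and the function names are hypothetical.

```python
import sqlite3

def build_spots_db():
    # Hypothetical data: one spot with 3 ratings and 4 images.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE spots (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE ratings (id INTEGER PRIMARY KEY, spot_id INTEGER, rating INTEGER);
        CREATE TABLE spot_images (id INTEGER PRIMARY KEY, spot_id INTEGER, image_name TEXT UNIQUE);
        INSERT INTO spots VALUES (1, 'Harbour');
        INSERT INTO ratings VALUES (1, 1, 3), (2, 1, 4), (3, 1, 5);
        INSERT INTO spot_images VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'), (4, 1, 'd');
    """)
    return conn

def naive_image_names(conn):
    # Joining both child tables first yields 3 x 4 = 12 rows for the spot,
    # so every image name is aggregated once per rating.
    sql = """
        SELECT group_concat(si.image_name, ',')
        FROM spots s
        LEFT JOIN ratings r ON r.spot_id = s.id
        JOIN spot_images si ON si.spot_id = s.id
        WHERE s.id = 1
        GROUP BY s.id
    """
    return conn.execute(sql).fetchone()[0].split(",")

def preaggregated(conn):
    # Aggregating each child table in its own subquery avoids the fan-out.
    sql = """
        SELECT r.avg_rating, si.image_names
        FROM spots s
        LEFT JOIN (SELECT spot_id, round(avg(rating), 2) AS avg_rating
                   FROM ratings GROUP BY spot_id) r ON r.spot_id = s.id
        JOIN (SELECT spot_id, group_concat(image_name, ',') AS image_names
              FROM spot_images GROUP BY spot_id) si ON si.spot_id = s.id
        WHERE s.id = 1
    """
    avg_rating, names = conn.execute(sql).fetchone()
    return avg_rating, names.split(",")

if __name__ == "__main__":
    conn = build_spots_db()
    print(len(naive_image_names(conn)))  # 12 names, each image repeated 3 times
    print(preaggregated(conn))
```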

SQL: Select count of a record in right table with joins

I have 2 tables, one for mobiles and the other for reviews. The reviews table stores the reviews of a specific mobile against its mobile id.
Structure of mobiles table.
mobile_id | mobile_name
Structure of reviews table.
review_id | mobile_id | review_body
So far I have written this query.
SELECT c.*, p.review_body
FROM ((select mobile_id, mobile_name from mobiles
WHERE brand_id=1 limit 0,5) c)
left JOIN
(
SELECT mobile_id,
MAX(review_id) MaxDate
FROM reviews
GROUP BY mobile_id
) MaxDates ON c.mobile_id = MaxDates.mobile_id left JOIN
reviews p ON MaxDates.mobile_id = p.mobile_id
AND MaxDates.MaxDate = p.review_id
This query returns the first 5 mobiles from mobile table and their latest (one) review from review table. This is the result it returns.
mobile_id | mobile_name | review_body
Question: I also want review_count in the result. review_count should be equal to the total number of reviews a mobile has in the reviews table against its mobile_id.
So please tell me how this can be done by extending the single query I already have. Any help would be appreciated, as I have been trying to do this for 24 hours.
I think this would work
SELECT c.*, p.review_body, MaxDates.review_count
FROM ((select mobile_id, mobile_name from mobiles
WHERE brand_id=1 limit 0,5) c)
left JOIN
(
SELECT mobile_id,count(review_id) review_count,
MAX(review_id) MaxDate
FROM reviews
GROUP BY mobile_id
) MaxDates ON c.mobile_id = MaxDates.mobile_id left JOIN
reviews p ON MaxDates.mobile_id = p.mobile_id
AND MaxDates.MaxDate = p.review_id
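Here is a quick way to check the idea, sketched with Python's built-in sqlite3 module. The sample rows and the function names are invented. Two things differ from the query above and are my own additions: an ORDER BY to make "the first 5" deterministic, and COALESCE so mobiles without reviews show 0 instead of NULL.

```python
import sqlite3

def build_mobiles_db():
    # Hypothetical sample data.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mobiles (mobile_id INTEGER PRIMARY KEY, mobile_name TEXT, brand_id INTEGER);
        CREATE TABLE reviews (review_id INTEGER PRIMARY KEY, mobile_id INTEGER, review_body TEXT);
        INSERT INTO mobiles VALUES (1, 'M1', 1), (2, 'M2', 1), (3, 'M3', 2);
        INSERT INTO reviews VALUES (1, 1, 'ok'), (2, 1, 'great'), (3, 3, 'meh');
    """)
    return conn

def latest_review_with_count(conn, brand_id):
    # The count and the latest review id come from the same grouped subquery,
    # so the outer query still returns one row per mobile.
    sql = """
        SELECT c.mobile_id, c.mobile_name, p.review_body,
               COALESCE(agg.review_count, 0) AS review_count
        FROM (SELECT mobile_id, mobile_name
              FROM mobiles
              WHERE brand_id = ?
              ORDER BY mobile_id
              LIMIT 5) c
        LEFT JOIN (SELECT mobile_id,
                          COUNT(review_id) AS review_count,
                          MAX(review_id) AS latest_id
                   FROM reviews
                   GROUP BY mobile_id) agg ON agg.mobile_id = c.mobile_id
        LEFT JOIN reviews p ON p.review_id = agg.latest_id
        ORDER BY c.mobile_id
    """
    return conn.execute(sql, (brand_id,)).fetchall()

if __name__ == "__main__":
    for row in latest_review_with_count(build_mobiles_db(), 1):
        print(row)
```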

SQL Arithmetic and joining three columns

I have a schema that looks like this:
+----------+
| tour |
+----------+
| id |
| name |
+----------+
+----------+
| golfer |
+----------+
| id |
| name |
| tour_id |
+----------+
+-----------+
| stat |
+-----------+
| id |
| round |
| score |
| golfer_id |
+-----------+
So essentially a golf tour has X number of golfers in it. A golfer will have X number of stats. The round column in the stat table just contains numbers (1, 2, 3, 4... and so on). They aren't necessarily one after the other but they are unique.
I now want to find all golfers that belong to the "PGA" tour and for each of those golfers, tally up their scores from the last 2 rounds. The last 2 rounds are essentially the rows in the stat table for the golfer with the biggest two numbers. So let's say golfer "Tiger Woods" has played in rounds 1, 3, 6 and 10, then I will only want to tally his scores from rounds 6 and 10. Another requirement is that I don't want to show golfers who are yet to have played in at least two rounds.
I've tried several ways to get this going but have always got myself into a tangle.
If you just want the last two rounds (emphasis on "two"), there is a simple trick. This trick does not extend to getting more than two records, or records other than the last two. For getting arbitrary records in a partition, you'll have to use window functions, which are more involved and only supported in newer versions of mainstream database engines.
The trick is to self-join the "stat" table on the golfer id (an equi-join). This way, you get all combinations of any two rounds of a golfer, including combinations with the same round:
SELECT s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
Then you exclude (via a WHERE clause) the combinations that have the same rounds and also make sure that these combinations are always first round > second round. This means that now you have all combinations of any two rounds of a golfer, with no duplicates:
SELECT s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
Notice that if you select only the records for a particular golfer and sort DESC on the two round columns, the top row will be the last two rounds of that golfer:
SELECT TOP 1 s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
ORDER BY s1.round DESC, s2.round DESC
TOP 1 is SQL Server lingo to get the top row. For MySQL, you need to use LIMIT 1. For other databases, use the database engine's particular way.
However, in this case you can't do it so simply because you need the last two rounds of ALL golfers. You'll have to do more joins:
SELECT id,
(SELECT MAX(s1.round) FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round AND s1.golfer_id = golfer.id) AS last_round,
(SELECT MAX(s2.round) FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round AND s1.golfer_id = golfer.id) AS second_to_last_round
FROM golfer
This will give you the last two rounds (in two columns) for each golfer.
Or joining the golfer table with the two-column temp set should work also:
SELECT golfer.id, MAX(r.s1_round) AS last_round, MAX(r.s2_round) AS second_to_last_round
FROM golfer INNER JOIN
(
SELECT s1.golfer_id AS golfer_id, s1.round AS s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
) r ON (r.golfer_id = golfer.id)
GROUP BY golfer.id
I leave it as a trivial exercise to join this query to the tour table to get golfers of the PGA tour, and to join this query back to the stats table to get the scores of the last two rounds.
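One possible way to finish that exercise is sketched below with Python's built-in sqlite3 module. The sample golfers and scores are invented, the function names are hypothetical, and the query assumes, as the question states, that round numbers are unique per golfer.

```python
import sqlite3

def build_golf_db():
    # Hypothetical data: one PGA golfer with four rounds, one PGA golfer
    # with a single round, and one golfer on another tour.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tour   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE golfer (id INTEGER PRIMARY KEY, name TEXT, tour_id INTEGER);
        CREATE TABLE stat   (id INTEGER PRIMARY KEY, round INTEGER, score INTEGER, golfer_id INTEGER);
        INSERT INTO tour VALUES (1, 'PGA'), (2, 'Other');
        INSERT INTO golfer VALUES (1, 'Tiger Woods', 1), (2, 'Rookie', 1), (3, 'Outsider', 2);
        INSERT INTO stat VALUES (1, 1, 70, 1), (2, 3, 71, 1), (3, 6, 68, 1), (4, 10, 66, 1),
                                (5, 2, 72, 2),
                                (6, 1, 75, 3), (7, 2, 74, 3);
    """)
    return conn

def last_two_round_totals(conn, tour_name):
    # Subquery r is the self-join trick: per golfer, the largest round and the
    # largest round below some other round. Golfers with fewer than two rounds
    # have no pair, so the inner join drops them.
    sql = """
        SELECT g.name, s_last.score + s_prev.score AS total
        FROM tour t
        JOIN golfer g ON g.tour_id = t.id
        JOIN (SELECT s1.golfer_id,
                     MAX(s1.round) AS last_round,
                     MAX(s2.round) AS prev_round
              FROM stat s1
              JOIN stat s2 ON s1.golfer_id = s2.golfer_id
              WHERE s1.round > s2.round
              GROUP BY s1.golfer_id) r ON r.golfer_id = g.id
        JOIN stat s_last ON s_last.golfer_id = g.id AND s_last.round = r.last_round
        JOIN stat s_prev ON s_prev.golfer_id = g.id AND s_prev.round = r.prev_round
        WHERE t.name = ?
        ORDER BY g.name
    """
    return conn.execute(sql, (tour_name,)).fetchall()

if __name__ == "__main__":
    print(last_two_round_totals(build_golf_db(), "PGA"))
```

With this data only Tiger Woods qualifies, and his total comes from rounds 6 and 10.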
HSQLDB 2.1 supports LATERAL joins, which allow this sort of select with arbitrary criteria.
A simple join will list all the golfers in the PGA tour:
select golfer.name from tour join golfer on (tour.id = tour_id and tour.name = 'PGA')
You then LATERAL join this table as many times as you need to get each particular score. The next example includes the score for the last round (only if the player has played a round):
select golfer.name, firststat.score from tour join golfer on (tour.id = tour_id and tour.name = 'PGA' ),
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1) firststat
In the next example, you use one more lateral join to include the last but one round. If the player has not played two rounds, there will be no row for the player:
select golfer.name, secondstat.score score1, firststat.score score2 from tour join golfer on (tour.id = tour_id and tour.name = 'PGA' ),
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1 offset 1) secondstat,
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1) firststat
The outer query needs no WHERE clause to link the LATERAL tables, because the correlating condition lives inside each subquery: a LATERAL subquery can reference tables that appear before it in the FROM list. That is why the SELECT statements in the LATERAL subqueries can use golfer.id from the previously joined table.