SQL - Two columns group by issue with MAX function - sql

SELECT artist.name, recording.name, MAX(recording.length)
FROM recording
INNER JOIN (artist_credit
INNER JOIN (artist_credit_name
INNER JOIN artist
ON artist_credit_name.artist_credit=artist.id)
ON artist_credit_name.artist_credit=artist_credit.id)
ON recording.artist_credit=artist_credit.id
WHERE artist.gender=1
AND recording.length <= (SELECT MAX(recording.length) FROM recording)
GROUP BY artist.name, recording.name
ORDER BY artist.name
We are using the MusicBrainz database for school and are having trouble with the GROUP BY because we have two columns (it works with one column, but not with two). We want the result to show one row per artist with that artist's second-longest recording, but the query returns the length of every recording by the same artist.
Any suggestions? Thanks.

You don't need multiple joins. Looking closely at the join conditions, they can be reduced to just one join, as shown below.
SELECT DISTINCT B.name, A.name, A.length
FROM recording A JOIN artist B
ON A.artist_credit=B.id
WHERE B.gender=1
AND A.length=(SELECT C.length FROM recording C
WHERE C.artist_credit=B.id
ORDER BY C.length DESC LIMIT 1, 1)
ORDER BY B.name;
See Using MySQL LIMIT to get the nth highest value

As others have pointed out, the join statement can be reduced. There also seems to be a problem with the operator in the AND condition: it should be < rather than <= in order to get the second-highest length (see: What is the simplest SQL Query to find the second largest value?).
I would suggest trying out the following:
SELECT artist.name, recording.name, MAX(recording.length)
FROM recording
JOIN artist ON recording.artist_credit = artist.id
WHERE
artist.gender=1
AND
recording.length < (SELECT MAX(recording.length) FROM recording)
GROUP BY artist.name
ORDER BY artist.name
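The "MAX below the MAX" idea only yields a per-artist result if both subqueries are correlated to the artist. Here is a minimal sketch of that, run through Python's sqlite3 module as a stand-in engine. The tiny schema and sample rows are made up for illustration, and it assumes (as the answers above do) that recording.artist_credit can be matched directly to artist.id, which is a simplification of the real MusicBrainz schema.

```python
import sqlite3

# Toy stand-in for the MusicBrainz tables; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT, gender INTEGER);
    CREATE TABLE recording (id INTEGER PRIMARY KEY, name TEXT,
                            length INTEGER, artist_credit INTEGER);
    INSERT INTO artist VALUES (1, 'Alice', 1), (2, 'Bob', 1), (3, 'Carol', 2);
    INSERT INTO recording VALUES
        (1, 'Short', 100, 1), (2, 'Medium', 200, 1), (3, 'Long', 300, 1),
        (4, 'Intro', 50, 2),  (5, 'Epic', 900, 2),
        (6, 'Solo', 400, 3);
""")

# For each artist, keep the recording whose length equals the largest length
# that is strictly below that same artist's maximum length.
second_longest = conn.execute("""
    SELECT a.name, r.name, r.length
    FROM recording AS r
    JOIN artist AS a ON r.artist_credit = a.id
    WHERE a.gender = 1
      AND r.length = (SELECT MAX(r2.length)
                      FROM recording AS r2
                      WHERE r2.artist_credit = a.id
                        AND r2.length < (SELECT MAX(r3.length)
                                         FROM recording AS r3
                                         WHERE r3.artist_credit = a.id))
    ORDER BY a.name
""").fetchall()

print(second_longest)
# [('Alice', 'Medium', 200), ('Bob', 'Intro', 50)]
```

Artists with only one recording drop out, because the inner MAX is NULL for them, and ties on the second-longest length would return more than one row per artist.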

Related

SQL retrieve info from 2 different ID in a single query

Let's say I have a table that represents soccer matches (X_train in this case) with an away_team_id and a home_team_id; those IDs point to another table, 'team_attributes'.
What I managed to do with a query is select the attributes of only one of the teams, but I'm interested in getting both teams' attributes.
This is the query I'm using now :
SELECT
X_Train.* , Team_Attributes.*, MAX(Team_Attributes.date)
FROM
X_Train
LEFT JOIN
Team_Attributes ON X_Train.home_team_api_id = Team_Attributes.team_api_id
AND Team_Attributes.date <= X_Train.date
GROUP BY
X_Train.id
ORDER BY
X_Train.date
This works fine, but I need the same join on X_train.away_team_api_id. Is there an easy way to do this? I tried using UNION, but maybe I didn't look far enough in that direction.
Thank you
You need a second join. For that -- and to simplify the query -- use table aliases:
SELECT t.*, hta.*, ata.*
FROM X_Train t LEFT JOIN
Team_Attributes hta
ON t.home_team_api_id = hta.team_api_id AND
hta.date <= t.date LEFT JOIN
Team_Attributes ata
ON t.away_team_api_id = ata.team_api_id AND
ata.date <= t.date
ORDER BY t.date DESC;
I don't understand what the GROUP BY is doing, so I removed it. Your question appears to be about JOIN logic anyway.
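The original GROUP BY with MAX(Team_Attributes.date) looks like an attempt to keep only the most recent attributes on or before the match date. One possible way to combine that with the two aliased joins is a correlated subquery in each ON clause. The following is only a sketch, run through Python's sqlite3 module; the speed column and the sample rows are invented for illustration.

```python
import sqlite3

# Toy data; the "speed" attribute and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X_Train (id INTEGER PRIMARY KEY, date TEXT,
                          home_team_api_id INTEGER, away_team_api_id INTEGER);
    CREATE TABLE Team_Attributes (team_api_id INTEGER, date TEXT, speed INTEGER);
    INSERT INTO X_Train VALUES (1, '2015-05-01', 10, 20),
                               (2, '2013-01-01', 20, 10);
    INSERT INTO Team_Attributes VALUES
        (10, '2014-01-01', 50), (10, '2015-01-01', 55), (10, '2016-01-01', 60),
        (20, '2014-06-01', 70);
""")

# Join Team_Attributes twice under different aliases, each time keeping only
# the latest attribute row dated on or before the match.
matches = conn.execute("""
    SELECT t.id, hta.speed AS home_speed, ata.speed AS away_speed
    FROM X_Train AS t
    LEFT JOIN Team_Attributes AS hta
      ON hta.team_api_id = t.home_team_api_id
     AND hta.date = (SELECT MAX(h2.date) FROM Team_Attributes AS h2
                     WHERE h2.team_api_id = t.home_team_api_id
                       AND h2.date <= t.date)
    LEFT JOIN Team_Attributes AS ata
      ON ata.team_api_id = t.away_team_api_id
     AND ata.date = (SELECT MAX(a2.date) FROM Team_Attributes AS a2
                     WHERE a2.team_api_id = t.away_team_api_id
                       AND a2.date <= t.date)
    ORDER BY t.date DESC
""").fetchall()

print(matches)
# [(1, 55, 70), (2, None, None)]
```

Match 2 predates every attribute row, so the LEFT JOINs keep the match and fill both attribute columns with NULL.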

Currently using a query to order another query; can these be combined?

Right now I have a form that queries a query in order to sort it. I know it is possible to order by more than one criterion, but when I add the second sort criterion to the query it doesn't seem to do anything.
This is the Random LP Picker query. It gives me a random selection of 10 LPs from a list of hundreds:
SELECT TOP 10 Artists.OriginalName, LPs.Album, LPs.rating, LPs.[Notable
Songs], LPs.Comments, [Listened (LPs)].last_date, [Listened
(LPs)].times_listened, LPs.LPID
FROM (LPs INNER JOIN Artists ON LPs.Performer_id = Artists.ArtistID) INNER
JOIN [Listened (LPs)] ON LPs.LPID = [Listened (LPs)].disc_id
WHERE (((LPs.Status)=1 Or (LPs.Status)=5) AND ((LPs.[Media Type])=1))
ORDER BY Rnd(Int(Now()*[LPID])-Now()*[LPID]);
In my form I order the query by date like this:
SELECT [Random LP Picker].*
FROM [Random LP Picker]
ORDER BY [Random LP Picker].last_date;
I tried putting both sorts in the Random LP Picker query, so it looks like:
ORDER BY Rnd(Int(Now()*[LPID])-Now()*[LPID]), [Listened (LPs)].last_date;
Doing that does not give me the list sorted by last_date. I also tried reversing those two sort items, but that causes several fields to not appear at all for reasons beyond my limited knowledge.
It would be useful if I could do this all in a single query. Is it possible?
NOTE: A couple of people have said, why not just order by date. The thing is, what this query is doing is randomly ordering all the entries and then returning the first 10. So if I remove the order by Rnd(...) part then I am no longer getting 10 random entries. If there's another way to get 10 random entries without using ORDER then I'd be happy to do that, but this is the only way I know to do it.
Consider moving query into a derived table (i.e., subquery in FROM) and sort on outer query:
SELECT main.*
FROM
(SELECT TOP 10 a.OriginalName, l.Album, l.rating, l.[Notable Songs],
l.Comments, p.last_date, p.times_listened, l.LPID
FROM (LPs l INNER JOIN Artists a ON l.Performer_id = a.ArtistID)
INNER JOIN [Listened (LPs)] p ON l.LPID = p.disc_id
WHERE ((l.Status IN (1,5)) AND (l.[Media Type]=1))
ORDER BY Rnd(Int(Now()*l.[LPID])-Now()*l.[LPID])
) As main
ORDER BY main.last_date
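The same "pick randomly inside, sort outside" pattern can be tried in any engine. Below is a minimal sketch using Python's sqlite3 module, where ORDER BY RANDOM() LIMIT 10 plays the role of Access's TOP 10 ... ORDER BY Rnd(...); the lps table and its rows are made up for the example.

```python
import sqlite3

# Hypothetical single-table version of the LP list.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lps (lpid INTEGER PRIMARY KEY, album TEXT, last_date TEXT)")
conn.executemany(
    "INSERT INTO lps VALUES (?, ?, ?)",
    [(i, f"Album {i}", f"2020-01-{(i * 7) % 28 + 1:02d}") for i in range(1, 51)],
)

# Inner query: 10 random rows. Outer query: sort just those 10 by date.
rows = conn.execute("""
    SELECT picked.*
    FROM (SELECT lpid, album, last_date
          FROM lps
          ORDER BY RANDOM()
          LIMIT 10) AS picked
    ORDER BY picked.last_date
""").fetchall()

dates = [row[2] for row in rows]
print(len(rows), dates == sorted(dates))
```

Adding last_date as a second key to the inner ORDER BY has no visible effect because the second key only breaks ties in the random value, which almost never occur.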
Since you want the final result set sorted by last_date, the last ORDER BY should be on [Listened (LPs)].last_date, and a subquery should return the TOP 10 rows based on random numbers.
You can try replacing the line
ORDER BY Rnd(Int(Now()*[LPID])-Now()*[LPID]), [Listened (LPs)].last_date;
with
ORDER BY [Listened (LPs)].last_date;
and moving the random TOP 10 selection into an IN subquery. The modified query would look like:
SELECT Artists.OriginalName, LPs.Album, LPs.rating, LPs.[Notable
Songs], LPs.Comments, [Listened (LPs)].last_date, [Listened
(LPs)].times_listened, LPs.LPID
FROM (LPs INNER JOIN Artists ON LPs.Performer_id = Artists.ArtistID) INNER
JOIN [Listened (LPs)] ON LPs.LPID = [Listened (LPs)].disc_id
WHERE (((LPs.Status)=1 Or (LPs.Status)=5) AND ((LPs.[Media Type])=1))
AND LPs.LPID IN (SELECT TOP 10 LPID FROM LPs ORDER BY Rnd(Int(Now()*[LPID])-Now()*[LPID]))
ORDER BY [Listened (LPs)].last_date;

Need to find average and number of repetitions of column

I have an SQL statement:
SELECT application.id,title,url,company.name AS company_name,package_name,ranking,date,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name FROM application
JOIN application_history ON application_history.application_id = application.id
JOIN company ON application.company_id = company.id
JOIN country ON application_history.country_id = country.id
JOIN collection ON application_history.collection_id = collection.id
JOIN category ON application_history.category_id = category.id
WHERE application.platform=0
AND country.name ='CZ'
AND collection.name='topfreeapplications'
AND category.name='UTILITIES'
AND application_history.ranking <= 10
AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
ORDER BY application_history.ranking ASC
It produces this result :
I'd like to add two columns: the average ranking for a given package, and the number of appearances, which would count the number of times a package appears in the list. I'd also like to group the results by package_name, so that there are no duplicates.
So far, I've tried adding a GROUP BY clause before the ORDER BY:
GROUP BY package_name
But it returns an error:
column "application.id" must appear in the GROUP BY clause or be used in an aggregate function
If I add each and every column it asks me for, it doesn't work.
I have also tried to count the number of package names, by adding after the SELECT :
COUNT(package_name) AS count
It produces a similar error.
How could I get the result I'm looking for ? Should I make two queries instead, or is it possible to get everything at once ?
To be clear, I have looked at other answers on S.O., but none of them tries to do the COUNT on a "produced" column.
Thank you for your help.
Edit :
Here is the result I expected at first :
Although Gordon's advice didn't give me the result I wanted, it put me on the right track when I read this:
From the docs: "Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row."
So I went back to using COUNT and AVG alone. My problem was that I wanted to display the ranking and date columns to check whether things were right. But putting these columns into the SELECT prevented the GROUP BY from working as expected, as mentioned by Jarlh in the comments.
The working query :
SELECT application.id,title,url,company.name AS company_name,package_name,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name,
COUNT(package_name) AS count, AVG(application_history.ranking) AS avg
FROM application
JOIN application_history ON application_history.application_id = application.id
JOIN company ON application.company_id = company.id
JOIN country ON application_history.country_id = country.id
JOIN collection ON application_history.collection_id = collection.id
JOIN category ON application_history.category_id = category.id
WHERE application.platform=0
AND country.name ='CZ'
AND collection.name='topfreeapplications'
AND category.name='UTILITIES'
AND application_history.ranking <= 10
AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
GROUP BY package_name,application.id,company.name,country.name,collection.name,category.name
ORDER BY count DESC
I think you want window/analytic functions. The following adds two columns, one for the count of rows for each package and the other an average ranking for them:
SELECT application.id, title, url, company.name AS company_name, package_name,
ranking, date, platform, country.name AS country_name,
collection.name AS collection_name, category.name AS category_name,
count(*) over (partition by package_name) as count,
avg(ranking) over (partition by package_name) as avg_package_ranking
FROM application . . .
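The difference between the two approaches in this thread is easiest to see side by side. This is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer); the single history table and its three rows are invented and stand in for the joined application tables.

```python
import sqlite3

# Hypothetical, already-joined history rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (package_name TEXT, ranking INTEGER, date TEXT);
    INSERT INTO history VALUES
        ('com.a', 1, '2020-01-01'), ('com.a', 3, '2020-01-02'),
        ('com.b', 2, '2020-01-01');
""")

# GROUP BY collapses each package into a single row.
grouped = conn.execute("""
    SELECT package_name, COUNT(*) AS appearances, AVG(ranking) AS avg_ranking
    FROM history
    GROUP BY package_name
    ORDER BY appearances DESC, package_name
""").fetchall()

# Window functions keep every row and repeat the per-package figures.
windowed = conn.execute("""
    SELECT package_name, ranking,
           COUNT(*) OVER (PARTITION BY package_name) AS appearances,
           AVG(ranking) OVER (PARTITION BY package_name) AS avg_ranking
    FROM history
    ORDER BY package_name, ranking
""").fetchall()

print(grouped)   # [('com.a', 2, 2.0), ('com.b', 1, 2.0)]
print(windowed)  # [('com.a', 1, 2, 2.0), ('com.a', 3, 2, 2.0), ('com.b', 2, 1, 2.0)]
```

If one row per package is the goal, the GROUP BY form fits, provided per-row columns such as ranking and date are left out of the SELECT list.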

Sum(IIF( including results with 0 count

Hi all, I am using Sum(IIf(...)) to return a count based on multiple criteria.
I am running a report on calls received per site, but I need sites with 0 calls to be included in the result set, with a value of 0 or even Null, if they had no calls that week.
The only issue is that my WHERE clause only includes sites that have had calls in the week.
Any ideas?
Code:
SELECT
d.sitename,
count(c.Chargeablecalls) AS All_Calls,
SUM(IIf(c.ChargeableCalls Like "Chargeable",1,0)) AS Chargeable_calls,
d.sitetype
FROM
(Callstatus AS s LEFT JOIN statusconversion AS c ON s.description=c.reportheading)
INNER JOIN sitedetails AS d ON s.zone=d.zone
WHERE s.date_loaded BETWEEN
(SELECT reportdate FROM reportMonth) AND (SELECT priorweek FROM reportMonth)
GROUP BY d.sitename, d.sitetype;
You need a RIGHT JOIN to sitedetails in order to get all the sites, even those with no calls.
You may need to do the first half of the query separately and then use that query in the main query.
Create a new query, qryCallStatus:
SELECT DISTINCT zone, description
FROM Callstatus, reportMonth
WHERE
Callstatus.date_loaded BETWEEN reportMonth.reportdate AND reportMonth.priorweek;
Then change your output query to:
SELECT
d.sitename,
count(c.Chargeablecalls) AS All_Calls,
SUM(IIf(c.ChargeableCalls Like "Chargeable",1,0)) AS Chargeable_calls,
d.sitetype
FROM
(sitedetails AS d LEFT JOIN qryCallStatus AS s ON d.zone=s.zone)
LEFT JOIN statusconversion AS c ON s.description=c.reportheading
GROUP BY d.sitename, d.sitetype;
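The underlying point is that the site table has to be the preserved side of an outer join, and the date filter must not sit in a WHERE clause on the calls table, or the sites without calls are filtered out again. A minimal sketch of that, using Python's sqlite3 module with a simplified two-table layout (the calls table, its columns, and the dates are hypothetical; CASE replaces Access's IIf):

```python
import sqlite3

# Simplified, made-up layout: one row per call, tagged with a zone.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sitedetails (zone INTEGER, sitename TEXT);
    CREATE TABLE calls (zone INTEGER, kind TEXT, loaded TEXT);
    INSERT INTO sitedetails VALUES (1, 'North'), (2, 'South');
    INSERT INTO calls VALUES
        (1, 'Chargeable', '2020-01-02'), (1, 'Free', '2020-01-03'),
        (1, 'Chargeable', '2019-12-01'), (2, 'Chargeable', '2019-12-05');
""")

# The date range is part of the ON clause, so sites with no calls in the
# week still produce one row (with NULL call columns) before grouping.
report = conn.execute("""
    SELECT d.sitename,
           COUNT(c.kind) AS all_calls,
           SUM(CASE WHEN c.kind = 'Chargeable' THEN 1 ELSE 0 END) AS chargeable_calls
    FROM sitedetails AS d
    LEFT JOIN calls AS c
      ON c.zone = d.zone
     AND c.loaded BETWEEN '2020-01-01' AND '2020-01-07'
    GROUP BY d.sitename
    ORDER BY d.sitename
""").fetchall()

print(report)
# [('North', 2, 1), ('South', 0, 0)]
```

Pre-filtering the calls in a saved query, as suggested above, achieves the same thing in Access, where conditions in a join's ON clause are more restricted.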

SQL SELECT query with JOIN, SUM and GROUP BY

I have 5 tables in an MS Access database: tblMember, tblPoint, tblRace, tblRaceType and tblResult. (All of which have primary keys.)
tblPoint contains (RaceTypeID, Position, Points) fields.
What I want to do is look at all the races that the members participated in, see what position they came (stored in tblResult) and see if those positions score points (as defined in tblPoint). I then want to add up all the points for each member and show these, along with the member name in my query...
Is this possible? I came up with my best shot at this SQL query below:
SELECT Sum(tblPoint.Points) AS SumOfPoints, Count(tblRace.RaceID) AS CountOfRaceID,
tblMember.MemberName, tblPoint.Points
FROM ((tblRaceType INNER JOIN tblPoint ON tblRaceType.RaceTypeID = tblPoint.RaceTypeID)
INNER JOIN tblRace ON tblRaceType.RaceTypeID = tblRace.RaceTypeID) INNER JOIN
(tblMember INNER JOIN tblResult ON tblMember.MemberID = tblResult.MemberID) ON
tblRace.RaceID = tblResult.RaceID
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
Is anyone able to point me in the right direction at all?
I'd say this
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
should probably be more like this:
GROUP BY tblMember.MemberName
ORDER BY Sum(tblPoint.Points) DESC;
Also, remove tblPoint.Points at the end of your SELECT. That is just a single point value, whereas you want the sum.
Grouping by points means that you'll get one row per member and point value they scored, which is probably not what you intended.
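To illustrate the corrected grouping, here is a small sketch using Python's sqlite3 module. It assumes tblResult has a Position column (the question says positions are stored there, but the column name is a guess) and matches it to tblPoint.Position, which the original query does not do; the sample rows are invented.

```python
import sqlite3

# Toy versions of the tables; tblResult.Position is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMember (MemberID INTEGER PRIMARY KEY, MemberName TEXT);
    CREATE TABLE tblRace (RaceID INTEGER PRIMARY KEY, RaceTypeID INTEGER);
    CREATE TABLE tblPoint (RaceTypeID INTEGER, Position INTEGER, Points INTEGER);
    CREATE TABLE tblResult (RaceID INTEGER, MemberID INTEGER, Position INTEGER);
    INSERT INTO tblMember VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO tblRace VALUES (1, 1), (2, 1);
    INSERT INTO tblPoint VALUES (1, 1, 10), (1, 2, 6);
    INSERT INTO tblResult VALUES (1, 1, 1), (1, 2, 2), (2, 2, 1), (2, 1, 5);
""")

# One row per member: total points and number of races entered.
# The LEFT JOIN keeps finishes that score no points (they add nothing).
totals = conn.execute("""
    SELECT m.MemberName,
           COALESCE(SUM(p.Points), 0) AS TotalPoints,
           COUNT(r.RaceID) AS Races
    FROM tblMember AS m
    JOIN tblResult AS r ON r.MemberID = m.MemberID
    JOIN tblRace AS ra ON ra.RaceID = r.RaceID
    LEFT JOIN tblPoint AS p
      ON p.RaceTypeID = ra.RaceTypeID AND p.Position = r.Position
    GROUP BY m.MemberID, m.MemberName
    ORDER BY TotalPoints DESC, m.MemberName
""").fetchall()

print(totals)
# [('Ben', 16, 2), ('Ann', 10, 2)]
```

Without the Position match in the join, each result would pair with every point row for that race type and the sums would be inflated.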