SQL: order by how many rows there are (using COUNT?) - sql

For SMF, I'm making a roster for the members of my clan (please don't come with "You should ask SMF", because that is completely irrelevant; this is just contextual information).
I need it to select all members (from smf_members) and order it by how many permissions they have in smf_permissions (so the script can determine who is higher in rank).
You can retrieve how many permissions there are by using: COUNT(permission) FROM smf_permissions.
I am now using this SQL:
SELECT DISTINCT(m.id_member), m.real_name, m.date_registered
FROM smf_members AS m, smf_permissions AS p
WHERE m.id_group=p.id_group
ORDER BY COUNT(p.permission)
However, this only returns one row! How to return several rows?
Cheers,
Aart

You need a GROUP BY. I've also rewritten with explicit JOIN syntax. You might need to change to LEFT JOIN if you want to include members with zero permissions.
SELECT m.id_member,
m.real_name,
m.date_registered,
COUNT(p.permission) AS N
FROM smf_members AS m
JOIN smf_permissions AS p
ON m.id_group = p.id_group
GROUP BY m.id_member,
m.real_name,
m.date_registered
ORDER BY COUNT(p.permission)

Related

When I try to get MIN value of manufactured parts I get (Only one expression ... when the subquery is not introduced with EXISTS)

I try to get MIN value of manufactured parts grouped by project like so:
This is my query:
SELECT
proinfo.ProjectN
,ProjShipp.[Parts]
,ProjShipp.Qty AS 'Qty Total'
,Sum(DailyProduction.Quantity) AS 'Qty Manufactured'
,(SELECT DailySumPoteau.IdProject, MIN(DailySumPoteau.DailySum)
FROM (SELECT PShipp.IdProject, SUM(DailyWelding.Quantity) DailySum
FROM DailyWeldingPaintProduction DailyWelding
INNER JOIN ProjectShipping PShipp ON PShipp.id=DailyWelding.FK_idPartShip
WHERE PShipp.id=ProjShipp.id
GROUP BY PShipp.id,PShipp.IdProject)DailySumPoteau
GROUP BY DailySumPoteau.IdProject ) AS 'Qt Pole'
FROM [dbo].[DailyWeldingPaintProduction] DailyProduction
INNER join ProjectShipping ProjShipp on ProjShipp.id=DailyProduction.FK_idPartShip
inner join ProjectInfo proinfo on proinfo.id=IdProject
GROUP By proinfo.id
,proinfo.ProjectN
,ProjShipp.[Parts]
,ProjShipp.Qty
,ProjShipp.[Designation]
,ProjShipp.id
I have three tables:
01 - ProjectInfo: it stores information about the project:
02 - ProjectShipping: it stores information about the parts and it has ProjectInfoId as foreign key:
03 - DailyWeldingPaintProduction: it stores information about daily production and it has ProjectShippingId as foreign key:
but when I run it I get this error:
Msg 116, Level 16, State 1, Line 13
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
How can I solve this problem?.
From your target results, I suspect that you want a window MIN(). Assuming that your query works and generates the correct results when the subquery is removed (column QtPole left apart), that would be:
SELECT pi.ProjectN, ps.[Parts], ps.Qty AS QtyTotal,
SUM(dp.Quantity) AS QtyManufactured,
MIN(SUM(dp.Quantity)) OVER(PARTITION BY pi.ProjectN) AS QtPole
ps.Designation
FROM [dbo].[DailyWeldingPaintProduction] dp
INNER join ProjectShipping ps on ps.id=dp.FK_idPartShip
INNER join ProjectInfo pi on pi.id=IdProject
GROUP BY pi.id, pi.ProjectN, ps.[Parts], ps.Qty, ps.Designation, ps.id
Side note: don't use single quotes for identifiers; they should be reserved for literal strings only. Use the proper quoting character for your database (in SQL Server: square brackets) - or better yet, use identifiers that do not require being quoted.
Formulating the query in the way you have done is not necessarily the best solution. As the other solution mentions, the best method in this instance is probably to use a window function / OVER. But since this can depend on indexes, and also to understand what went wrong, I will give you the way to fix the original query.
The issue with your query is that it has a correlated subquery in the SELECT which returns two values. What you are trying to do can be done in RDBMSs that support row constructors, unfortunately SQl Server is not one of them.
What you are trying to get at here is to get a whole resultset per row of the table.
The correct syntax for your query is to APPLY the resultset of the subquery for every row. You can CROSS APPLY in this instance because you are guaranteed a result anyway due to the correlation:
SELECT
proinfo.ProjectN
,ProjShipp.[Parts]
,ProjShipp.Qty AS 'Qty Total'
,Sum(DailyProduction.Quantity) AS 'Qty Manufactured'
,QtPole.IdProject
,QtPole.MinDailySum
FROM [dbo].[DailyWeldingPaintProduction] DailyProduction
INNER join ProjectShipping ProjShipp on ProjShipp.id=DailyProduction.FK_idPartShip
inner join ProjectInfo proinfo on proinfo.id=ProjShipp.IdProject
CROSS APPLY (
SELECT DailySumPoteau.IdProject, MIN(DailySumPoteau.DailySum) MinDailySum
FROM (SELECT DailyWelding.FK_idPartShip IdProject, SUM(DailyWelding.Quantity) DailySum
FROM DailyWeldingPaintProduction DailyWelding
WHERE DailyWelding.FK_idPartShip=ProjShipp.id
GROUP BY DailyWelding.FK_idPartShip) DailySumPoteau
GROUP BY DailySumPoteau.IdProject
) AS QtPole
GROUP By proinfo.id
,proinfo.ProjectN
,ProjShipp.[Parts]
,ProjShipp.Qty
,ProjShipp.[Designation]
,ProjShipp.id
,QtPole.IdProject
,QtPole.MinDailySum
I have taken the liberty of cleaning up the subquery by removing the unnecessary ProjectShipping reference. Note that the addition of grouping columns here does not matter because of the correlation to ProjShipp.Id
Note also that depending on indexes and density and such like, it may be better to formulate the subquery as a JOIN instead, with the correlation on the outside in the ON. You would need to modify the grouping in that case.

SQL Inner Join and Grouping

It's been a while since I've done SQL, but I have a rather pressing issue.
My db-layout is as following:
Now, starting from a Users.ID, I want to get all the Rounds the user has played. The user could be Hosts.HostID, Host.GuestID, or even both. Where he is both, it should not show op in the results.
The results I need from the Query are the Hosts.Name and all the fields of Rounds. In general what I want to do is display a list of all the Hosts (actually these are the Games) in which the user has participated, as a Host or as a Guest, along with perhaps a total score. When clicking on this, some dropdown will appear showing the individual round scores, words, ...
Now I was wondering whether this was possible in a single query. Of course I could do a query getting all the Hosts and then per Host a query for each Round, but that doesn't seem that performant. This is what I've come up with so far:
SELECT Rounds.ID, Rounds.GameID, Rounds.Round, Rounds.Score, Rounds.Word
, Hosts.ID, Hosts.HostID, Hosts.GuestID
FROM Rounds INNER JOIN Hosts
ON Rounds.GameID = Hosts.ID
INNER JOIN Users
ON Hosts.hostID = Users.ID
WHERE Users.ID = 5
The issue is however that it doesn't filter out where the user is both host AND guest, and I can't seem to Group it by Hosts.ID either.
Add Hosts.hostID <> Hosts.guestID to the where clause.
If you are using SQL Server 2005 or later version, you could modify your present query like this:
SELECT Rounds.ID, Rounds.GameID, Rounds.Round, Rounds.Score, Rounds.Word
, Hosts.ID, Hosts.HostID, Hosts.GuestID
FROM Rounds INNER JOIN Hosts
ON Rounds.GameID = Hosts.ID
CROSS APPLY
(
SELECT Hosts.HostID
UNION
SELECT Hosts.GuestID
) AS u (UserID)
WHERE u.UserID = 5
;
The CROSS APPLY clause would produce either one or two rows, depending on whether HostID and GuestID are equal. In either event, the WHERE condition would ultimately leave at most one. Thus, the above query would give you only the games (with all their rounds) where the specified user participated.
And from that query you could easily get to something like this:
SELECT Hosts.ID,
TotalScore = SUM(Rounds.Score)
FROM
...
GROUP BY
Hosts.ID
;

Select first or random row in group by

I have this query using PostgreSQL 9.1 (9.2 as soon as our hosting platform upgrades):
SELECT
media_files.album,
media_files.artist,
ARRAY_AGG (media_files. ID) AS media_file_ids
FROM
media_files
INNER JOIN playlist_media_files ON media_files.id = playlist_media_files.media_file_id
WHERE
playlist_media_files.playlist_id = 1
GROUP BY
media_files.album,
media_files.artist
ORDER BY
media_files.album ASC
and it's working fine, the goal was to extract album/artist combinations and in the result set have an array of media files ids for that particular combo.
The problem is that I have another column in media files, which is artwork.
artwork is unique for each media file (even in the same album) but in the result set I need to return just the first of the set.
So, for an album that has 10 media files, I also have 10 corresponding artworks, but I would like just to return the first (or a random picked one for that collection).
Is that possible to do with only SQL/Window Functions (first_value over..)?
Yes, it's possible. First, let's tweak your query by adding alias and explicit column qualifiers so it's clear what comes from where - assuming I've guessed correctly, since I can't be sure without table definitions:
SELECT
mf.album,
mf.artist,
ARRAY_AGG (mf.id) AS media_file_ids
FROM
"media_files" mf
INNER JOIN "playlist_media_files" pmf ON mf.id = pmf.media_file_id
WHERE
pmf.playlist_id = 1
GROUP BY
mf.album,
mf.artist
ORDER BY
mf.album ASC
Now you can either use a subquery in the SELECT list or maybe use DISTINCT ON, though it looks like any solution based on DISTINCT ON will be so convoluted as not to be worth it.
What you really want is something like an pick_arbitrary_value_agg aggregate that just picks the first value it sees and throws the rest away. There is no such aggregate and it isn't really worth implementing it for the job. You could use min(artwork) or max(artwork) and you may find that this actually performs better than the later solutions.
To use a subquery, leave the ORDER BY as it is and add the following as an extra column in your SELECT list:
(SELECT mf2.artwork
FROM media_files mf2
WHERE mf2.artist = mf.artist
AND mf2.album = mf.album
LIMIT 1) AS picked_artwork
You can at a performance cost randomize the selected artwork by adding ORDER BY random() before the LIMIT 1 above.
Alternately, here's a quick and dirty way to implement selection of a random row in-line:
(array_agg(artwork))[width_bucket(random(),0,1,count(artwork)::integer)]
Since there's no sample data I can't test these modifications. Let me know if there's an issue.
"First" pick
Wouldn't it be simpler / cheaper to just use min():
SELECT m.album
,m.artist
,array_agg(m.id) AS media_file_ids
,min(m.artwork) AS artwork
FROM playlist_media_files p
JOIN media_files m ON m.id = p.media_file_id
WHERE p.playlist_id = 1
GROUP BY m.album, m.artist
ORDER BY m.album, m.artist;
Abitrary / random pick
If you are looking for a random selection, #Craig already provided a solution with truly random picks.
You could also use a CTE to avoid additional scans on the (possibly big) base table and then run two separate (cheap) subqueries on the small result set.
For arbitrary selection - not truly random, the result will depend on the physical order of rows in the table and implementation-specifics:
WITH x AS (
SELECT m.album, m.artist, m.id, m.artwork
FROM playlist_media_files p
JOIN media_files m ON m.id = p.media_file_id
)
SELECT a.album, a.artist, a.media_file_ids, b.artwork
FROM (
SELECT album, artist, array_agg(id) AS media_file_ids
FROM x
) a
JOIN (
SELECT DISTINCT ON (1,2) album, artist, artwork
FROM x
) b USING (album, artist);
For truly random results, you can add an ORDER BY .. random() like this to subquery b:
JOIN (
SELECT DISTINCT ON (1, 2) album, artist, artwork
FROM x
ORDER BY 1, 2, random()
) b USING (album, artist);

SQL SELECT query with JOIN, SUM and GROUP BY

I have 5 tables in a MS Access databse: tblMember, tblPoint, tblRace, tblRaceType and tblResult. (All of which have primary keys.)
tblPoint contains (RaceTypeID, Position, Points) fields.
What I want to do is look at all the races that the members participated in, see what position they came (stored in tblResult) and see if those positions score points (as defined in tblPoint). I then want to add up all the points for each member and show these, along with the member name in my query...
Is this possible? I came up with my best shot at this SQL query below:
SELECT Sum(tblPoint.Points) AS SumOfPoints, Count(tblRace.RaceID) AS CountOfRaceID,
tblMember.MemberName, tblPoint.Points
FROM ((tblRaceType INNER JOIN tblPoint ON tblRaceType.RaceTypeID = tblPoint.RaceTypeID)
INNER JOIN tblRace ON tblRaceType.RaceTypeID = tblRace.RaceTypeID) INNER JOIN
(tblMember INNER JOIN tblResult ON tblMember.MemberID = tblResult.MemberID) ON
tblRace.RaceID = tblResult.RaceID
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
Is anyone able to point me in the right direction at all?
I'd say this
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
should probably be more like this:
GROUP BY tblMember.MemberName
ORDER BY Sum(tblPoint.Points) DESC;
Also, remove tblPoint.Points at the end of your select. This is just a single point value, you want the sum.
Grouping by points means that you'll get one row per member and point value they scored - probably not what you intended.

MySQL to return only last date / time record

We have a database that stores vehicle's gps position, date, time, vehicle identification, lat, long, speed, etc., every minute.
The following select pulls each vehicle position and info, but the problem is that returns the first record, and I need the last record (current position), based on date (datagps.Fecha) and time (datagps.Hora). This is the select:
SELECT configgps.Fichagps,
datacar.Ficha,
groups.Nombre,
datagps.Hora,
datagps.Fecha,
datagps.Velocidad,
datagps.Status,
datagps.Calleune,
datagps.Calletowo,
datagps.Temp,
datagps.Longitud,
datagps.Latitud,
datagps.Evento,
datagps.Direccion,
datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
Group by Fichagps;
I need same result I'm getting, but instead of the older record I need the most recent
(LAST datagps.Fecha / datagps.Hora).
How can I accomplish this?
Add ORDER BY datagps.Fecha DESC, datagps.Hora DESC LIMIT 1 to your query.
I'm not sure why you are having any problems with this as Lex's answers seem good.
I would start putting ORDER BY's in your query so it puts them in an order, when it's showing the record you want as the first one in the list, then add the LIMIT.
If you want the most recent, then the following should be good enough:
ORDER BY datagps.Fecha DESC, datagps.Hora DESC
If you simply want the record that was added to the database most recently (irregardless of the date/time fields), then you could (assuming you have an auto-incremental primary key in the datagps table (I assume it's called dataID for this example)):
ORDER BY datagps.dataID DESC
If these aren't showing the data you want - then there is something missing from your example (maybe data-types aren't DATETIME fields? - if not - then maybe a CONVERT to change them from their current type before ORDERing BY would be a good idea)
EDIT:
I've seen the screenshot and I'm confused as to what the issue is still. That appears to be showing everything in order. Are you implying that there are many more than 5 records? How many are you expecting?
Do you mean: for each record returned, you want the one row from the table datagps with the latest date and time attached to the result? If so, how about this:
# To show how the query will be executed
# comment to return actual results
EXPLAIN
SELECT
configgps.Fichagps, datacar.Ficha, groups.Nombre, datagps.Hora, datagps.Fecha,
datagps.Velocidad, datagps.Status, datagps.Calleune, datagps.Calletowo,
datagps.Temp, datagps.Longitud, datagps.Latitud, datagps.Evento,
datagps.Direccion, datagps.Provincia
FROM asigvehiculos
INNER JOIN datacar ON (asigvehiculos.Iddatacar = datacar.Id)
INNER JOIN configgps ON (datacar.Configgps = configgps.Id)
INNER JOIN clientdata ON (asigvehiculos.Idgroup = clientdata.group)
INNER JOIN groups ON (clientdata.group = groups.Id)
INNER JOIN datagps ON (configgps.Fichagps = datagps.Fichagps)
########### Add this section
LEFT JOIN datagps b ON (
configgps.Fichagps = b.Fichagps
# wrong condition
#AND datagps.Hora < b.Hora
#AND datagps.Fecha < b.Fecha)
# might prevent indexes to be used
AND (datagps.Fecha < b.Fecha OR (datagps.Fecha = b.Fecha AND datagps.Hora < b.Hora))
WHERE b.Fichagps IS NULL
###########
Group by configgps.Fichagps;
Similar question here only that that one uses outer joins.
Edit (again):
The conditions are wrong so corrected it. Can you show us the output of the above EXPLAIN query so we can pinpoint where the bottle neck is?
As hurikhan77 said, it will be better if you could convert both of the the columns into a single datetime field - though I'm guessing this would not be possible for your case (since your database is already being used?)
Though if you can convert it, the condition (on the join) would become:
AND datagps.FechaHora < b.FechaHora
After that, add an index for datagps.FechaHora and the query would be fast(er).
What you probably want is getting the maximum of (Fecha,Hora) per grouped dataset? This is a little complicated to accomplish with your column types. You should combine Fecha and Hora into one column of type DATETIME. Then it's easy to just SELECT MAX(FechaHora) ... GROUP BY Fichagps.
It could have helped if you posted your table structure to understand the problem.