I'm trying to select duplicates from this table:
snr zip
01 83
02 82
03 43
04 28
The expected result is just an empty table, because there are no duplicates.
I've tried with this query:
SELECT snr, zip
FROM student
GROUP BY snr
HAVING (COUNT(zip) > 1)
But it reports a syntax error, saying that I'm trying to query an aggregate function, etc.
It looks like you need to either remove zip from the SELECT columns, or else wrap it in an aggregate function, such as COUNT(zip):
SELECT snr, COUNT(zip)
FROM student
GROUP BY snr
HAVING (COUNT(zip) > 1)
Also check out @OMG Ponies's answer for further suggestions.
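As a quick check of the corrected query, here is a minimal sketch that runs it with Python's built-in sqlite3 module. It assumes both columns are stored as text. The extra row ('01', '99') is a hypothetical addition that is not in the original data; it is only there to show what the query returns once an snr value repeats.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (snr TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("01", "83"), ("02", "82"), ("03", "43"), ("04", "28")],
)

query = """
    SELECT snr, COUNT(zip)
    FROM student
    GROUP BY snr
    HAVING COUNT(zip) > 1
"""

# With the original data every snr occurs once, so nothing is returned.
no_duplicates = conn.execute(query).fetchall()

# Hypothetical extra row (not in the question): snr '01' now occurs twice.
conn.execute("INSERT INTO student VALUES ('01', '99')")
with_duplicate = conn.execute(query).fetchall()

print(no_duplicates)
print(with_duplicate)
```

The first result is empty, which matches the expected outcome described in the question; the second contains only the repeated snr together with its row count.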
Use:
SELECT snr, zip
FROM student
GROUP BY snr, zip
HAVING COUNT(DISTINCT zip) > 1
Standard SQL requires that columns in the SELECT clause that are not wrapped in aggregate functions (COUNT, MIN, MAX, etc.) be listed in the GROUP BY clause. However, MySQL and SQLite allow such columns to be omitted.
Additionally, use COUNT(DISTINCT ...) or you'll risk false positives.
Related
I have this query:
select id, convert(nvarchar(10), pubdate, 102) as pubdate,
channel_title, title, description, link, vertinimas
from table1
where statusid > 0
and channel_title = 'channel1'
group by title
order by pubdate desc
To exclude duplicate entries in the field "title", I added group by title at the end, but an error occurs:
"is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
The GROUP BY clause is normally used together with aggregate functions like count(), min(), max(), sum() etc. The select list can only contain columns that are part of the GROUP BY clause or to which you are applying an aggregate function.
For example, suppose you have a STUDENT table like the one below:
ID  NAME  SUBJECT  MARKS
1   FOO   ENGLISH  80
2   FOO   MATH     70
3   BAR   ENGLISH  100
4   BAR   MATH     50
5   ZIL   ENGLISH  90
6   ZIL   MATH     75
you can write a query like:
SELECT NAME, SUM(MARKS) AS TOTAL FROM STUDENT GROUP BY NAME;
Here, in the above query, NAME is part of the GROUP BY clause and we are applying the sum() aggregate function to the MARKS column. This will give us a result like the one below:
NAME  TOTAL
FOO   150
BAR   150
ZIL   165
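The totals above can be reproduced with a small sketch using Python's built-in sqlite3 module. The column types are assumptions, and an ORDER BY NAME is added only so that the row order is deterministic.

```python
import sqlite3

# In-memory copy of the STUDENT table from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (ID INTEGER PRIMARY KEY, NAME TEXT, SUBJECT TEXT, MARKS INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?)",
    [
        (1, "FOO", "ENGLISH", 80),
        (2, "FOO", "MATH", 70),
        (3, "BAR", "ENGLISH", 100),
        (4, "BAR", "MATH", 50),
        (5, "ZIL", "ENGLISH", 90),
        (6, "ZIL", "MATH", 75),
    ],
)

# NAME is the grouping column; MARKS only appears inside SUM().
totals = conn.execute(
    "SELECT NAME, SUM(MARKS) AS TOTAL FROM STUDENT GROUP BY NAME ORDER BY NAME"
).fetchall()

for name, total in totals:
    print(name, total)
```

Every selected column is either the grouping column or wrapped in an aggregate, which is the rule the error message in the question refers to.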
In the query in your post, only title is part of the GROUP BY clause. The remaining columns (id, pubdate, channel_title, description, link, vertinimas) are neither part of the GROUP BY clause nor passed as a parameter to any aggregate function.
If you want to find / exclude / delete duplicate rows, you can check out this blog post. This guy has explained it pretty well. Here is the link to find and delete duplicate records!
I've got a query that returns data like so:
student    course   grade
a-student  ENG-W05  100
a-student  MAT-W05  85
a-student  ENG-W06  100
b-student  MAT-W05  90
b-student  SCI-W05  75
The data is grouped by student and course. Ideally, I'd like to have the above data transformed into the below:
student    ENG-W05  MAT-W05  ENG-W06  SCI-W05
a-student  100      85       100      NULL
b-student  NULL     90       NULL     75
So, after the transformation, each student only has one record, with all of their grades (and any missing courses graded as null).
Does anyone have any ideas? Obviously, this is fairly simple to do if I take the data out and transform it in a language (like Python), but I'd love to get the data in the desired format with an SQL query.
Also, would it be possible to have the columns order alphabetically (ascending)? So, the final output would be:
student    ENG-W05  ENG-W06  MAT-W05  SCI-W05
a-student  100      100      85       NULL
b-student  NULL     NULL     90       75
EDIT: To clarify, the values in course aren't known. The ones I provided are just examples. So ideally, if more course values found their way into that first query result (the first table), they would still be mapped to columns in the final result (without needing to change the query). In reality, I actually have >1k distinct values for the course column, and so I can't manually write out each one.
demos:db<>fiddle
You can use conditional aggregation for that:
SELECT
student,
SUM(grade) FILTER (WHERE course = 'ENG-W05') as eng_w05,
SUM(grade) FILTER (WHERE course = 'MAT-W05') as mat_w05,
SUM(grade) FILTER (WHERE course = 'ENG-W06') as eng_w06,
SUM(grade) FILTER (WHERE course = 'SCI-W05') as sci_w05
FROM mytable
GROUP BY student
The FILTER clause lets you aggregate only some specific records. So each of these expressions aggregates all records for one specific course.
Finding the correct aggregate function can be difficult. Here SUM() does the job, as there's only one value per group. MAX() or MIN() would work as well. It depends on your real requirement. If there's really only one value per group, it doesn't matter which one you use; you just need some aggregation.
Instead of the FILTER clause, which is not supported by every database (Postgres supports it), you could use the more widely supported CASE expression:
SELECT
student,
SUM(
CASE
WHEN course = 'ENG-W05' THEN grade
END
) AS eng_w05,
...
You can use the conditional aggregation as follows:
select student,
max(case when course = 'ENG-W05' then grade end) as "ENG-W05",
max(case when course = 'MAT-W05' then grade end) as "MAT-W05",
max(case when course = 'ENG-W06' then grade end) as "ENG-W06",
max(case when course = 'SCI-W05' then grade end) as "SCI-W05"
from (your_query) t
group by student
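Both answers hard-code the course names, while the question's edit says the values aren't known in advance. One possible way around that is to generate the conditional aggregation from the data itself. The following is a minimal sketch using Python's built-in sqlite3 module, not a definitive implementation: the table name grades and the helper quote_ident are hypothetical, and on another database the same idea would need that database's own identifier quoting rules (or its dedicated pivot facilities).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the result of the original query.
conn.execute("CREATE TABLE grades (student TEXT, course TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?)",
    [
        ("a-student", "ENG-W05", 100),
        ("a-student", "MAT-W05", 85),
        ("a-student", "ENG-W06", 100),
        ("b-student", "MAT-W05", 90),
        ("b-student", "SCI-W05", 75),
    ],
)

def quote_ident(name):
    """Hypothetical helper: quote a value for use as a column alias."""
    return '"' + name.replace('"', '""') + '"'

# Step 1: read the distinct courses, sorted so the columns come out alphabetically.
courses = [
    row[0]
    for row in conn.execute("SELECT DISTINCT course FROM grades ORDER BY course")
]

# Step 2: build one MAX(CASE ...) column per course. The course value itself
# is passed as a bound parameter; only the alias is spliced into the SQL text.
columns = ",\n       ".join(
    f"MAX(CASE WHEN course = ? THEN grade END) AS {quote_ident(c)}"
    for c in courses
)
sql = f"SELECT student,\n       {columns}\nFROM grades\nGROUP BY student\nORDER BY student"

# Step 3: run the generated query.
cur = conn.execute(sql, courses)
header = [d[0] for d in cur.description]
rows = cur.fetchall()

print(header)
for row in rows:
    print(row)
```

Missing grades come back as None (SQL NULL), and new course values show up as extra columns the next time the script runs, without editing the query by hand. With more than 1k courses, the database's limits on the number of result columns and bound parameters would need checking.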
I have the following sql query:
select Judge, ResultIndex, count(*) as CasesForJudge
from SRSIndexes
group by Judge, ResultIndex
The columns "Judge" and "ResultIndex" are of nvarchar type. I receive output like this:
Adelina Andreeva 2a 24
Adelina Andreeva 5b 33
....
Georgy Ivanov 3b 44
Georgy Ivanov 5a 5
I want to find the sums (from the "CasesForJudge" column) for each judge (for example: Adelina Andreeva -> 57, Georgy Ivanov -> 49). How should I modify my query?
You just need to GROUP BY your Judge column:
SELECT Judge, count(*) AS CasesForJudge
FROM SRSIndexes
GROUP BY Judge
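To see why grouping by Judge alone gives the per-judge sums, here is a small sketch using Python's built-in sqlite3 module. The row counts are hypothetical and much smaller than in the question (2, 3 and 4 rows instead of 24, 33 and 44); only the table and column names are taken from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SRSIndexes (Judge TEXT, ResultIndex TEXT)")

# Hypothetical sample: one row per case, with small made-up counts.
cases = (
    [("Adelina Andreeva", "2a")] * 2
    + [("Adelina Andreeva", "5b")] * 3
    + [("Georgy Ivanov", "3b")] * 4
)
conn.executemany("INSERT INTO SRSIndexes VALUES (?, ?)", cases)

# Original grouping: one row per (Judge, ResultIndex) pair.
per_index = conn.execute(
    "SELECT Judge, ResultIndex, COUNT(*) AS CasesForJudge "
    "FROM SRSIndexes GROUP BY Judge, ResultIndex ORDER BY Judge, ResultIndex"
).fetchall()

# Coarser grouping: one row per Judge, counting all of that judge's cases.
per_judge = conn.execute(
    "SELECT Judge, COUNT(*) AS CasesForJudge "
    "FROM SRSIndexes GROUP BY Judge ORDER BY Judge"
).fetchall()

print(per_index)
print(per_judge)
```

Each per-judge count equals the sum of that judge's per-index counts (2 + 3 = 5 for the first judge in this sample), so no separate SUM over the first result is needed.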
In case you want both groupings in one query you can try grouping sets:
select Judge,
ResultIndex,
count(*) as CasesForJudge
from SRSIndexes
group by grouping sets ((Judge, ResultIndex), -- Initial grouping
(Judge)) -- Added one
I have a table with the columns id, title, relation_key. I want to get count(*) as well as the title for the corresponding relation_key column.
My table contains the following data:
id title relation_key
55 title1111 10
56 title2222 10
57 MytitleVVV 20
58 MytitlleXXX 20
I tried:
select title,count(*) from table where relation_key=10 group by title
But it's returning only 1 row. I want both title records for relation_key=10.
You probably want something along these lines:
select title, count(*) over (partition by relation_key)
from table
where relation_key = 10
The result of this would yield:
title | count
----------+------
title1111 | 2
title2222 | 2
Note that you cannot select fields that are not part of the GROUP BY clause in Oracle (as in most other databases).
As a general rule of thumb, you should avoid grouping if you don't really want to group data, but just use aggregate functions such as count(*). Most of Oracle's aggregate functions can be transformed into window functions by adding an over() clause, removing the need for a GROUP BY clause.
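The window-function query can be tried out with Python's built-in sqlite3 module, as a minimal sketch. It assumes an SQLite build recent enough to support window functions (3.25 or later). The table is named titles here because table is a reserved word, and the alias cnt and the ORDER BY title are additions to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "titles" is a hypothetical name; the question calls the table "table".
conn.execute(
    "CREATE TABLE titles (id INTEGER PRIMARY KEY, title TEXT, relation_key INTEGER)"
)
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        (55, "title1111", 10),
        (56, "title2222", 10),
        (57, "MytitleVVV", 20),
        (58, "MytitlleXXX", 20),
    ],
)

# COUNT(*) OVER (...) counts the rows in each partition without collapsing
# them, so every title is kept and carries the count for its relation_key.
rows = conn.execute(
    """
    SELECT title, COUNT(*) OVER (PARTITION BY relation_key) AS cnt
    FROM titles
    WHERE relation_key = 10
    ORDER BY title
    """
).fetchall()

for title, cnt in rows:
    print(title, cnt)
```

Both titles for relation_key 10 are returned, each with a count of 2, which is the result the question asks for.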
If you are getting an error, then please try the following:
select title,count(*) from table where relation_key=10 group by title,relation_key
I have the following table (highscores),
id gameid userid name score date
1 38 2345 A 100 2009-07-23 16:45:01
2 39 2345 A 500 2009-07-20 16:45:01
3 31 2345 A 100 2009-07-20 16:45:01
4 38 2345 A 200 2009-10-20 16:45:01
5 38 2345 A 50 2009-07-20 16:45:01
6 32 2345 A 120 2009-07-20 16:45:01
7 32 2345 A 100 2009-07-20 16:45:01
In the above structure, a user can play a game multiple times, but I want to display the "Games Played" by a specific user, so the games played section can't show the same game multiple times. The idea is that if a user played a game 3 times, only the play with the highest score should be displayed.
I want result data like:
id gameid userid name score date
2 39 2345 A 500 2009-07-20 16:45:01
3 31 2345 A 100 2009-07-20 16:45:01
4 38 2345 A 200 2009-10-20 16:45:01
6 32 2345 A 120 2009-07-20 16:45:01
I tried the following query, but it's not giving me the correct result:
SELECT id,
gameid,
userid,
date,
MAX(score) AS score
FROM highscores
WHERE userid='2345'
GROUP BY gameid
Please tell me what will be the query for this?
Thanks
The requirement is a bit vague/confusing, but would something like this satisfy the need?
(I purposely added various aggregates that may be of interest.)
SELECT gameid,
MIN(date) AS FirstTime,
MAX(date) AS LastTime,
MAX(score) AS TOPscore,
COUNT(*) AS NbOfTimesPlayed
FROM highscores
WHERE userid='2345'
GROUP BY gameid
-- ORDER BY COUNT(*) DESC -- for ex. to have games played most at top
Edit: New question about adding the id column to the SELECT list
The short answer is: "No, id cannot be added, not within this particular construct" (read further to see why). However, if the intent is to have the id of the game with the highest score, the query can be modified, using a sub-query, to achieve that.
As explained by Alex M on this page, all column names referenced in the SELECT list that are not used inside an aggregate function (MAX, MIN, AVG, COUNT and the like) MUST be included in the GROUP BY clause. The reason for this rule is simply that, in gathering the info for the results list, SQL may encounter multiple values for such a column (listed in SELECT but not in GROUP BY) and would then not know how to deal with them. Rather than doing something possibly useful but possibly silly with these extra values, the SQL standard dictates an error message, so that the user can modify the query and express his/her goals explicitly.
In our specific case, we could add id to the SELECT list and also to the GROUP BY list, but doing so would change the grouping on which the aggregation takes place: the results list would include as many rows as there are id + gameid combinations, and the aggregate values for each of these rows would be based only on the records where id and gameid have the corresponding values. Assuming id is the PK of the table, we'd get a single row per group, making MAX() and the like quite meaningless.
The way to include the id (and possibly other columns) corresponding to the game with the top score is with a sub-query. The idea is that the sub-query selects the top score within a given group, and the main query then SELECTs any column of the matching rows, even when those fields weren't (couldn't be) in the sub-query's GROUP BY construct. BTW, do give credit on this page to rexem for showing this type of query first.
SELECT H.id,
H.gameid,
H.userid,
H.name,
H.score,
H.date
FROM highscores H
JOIN (
SELECT H2.gameid, H2.userid, MAX(H2.score) AS MaxScoreByGameUser
FROM highscores H2
GROUP BY H2.gameid, H2.userid
) AS M
ON M.gameid = H.gameid
AND M.userid = H.userid
AND M.MaxScoreByGameUser = H.score
WHERE H.userid='2345'
A few important remarks about the query above:
Duplicates: if the user played a game several times and reached the same hi-score more than once, the query will produce that many rows.
The GROUP BY of the sub-query may need to change for different uses of the query. If, rather than searching for the game's hi-score on a per-user basis, we wanted the absolute hi-score, we would need to exclude userid from the GROUP BY (that's why I gave the MAX a long, explicit alias).
The userid = '2345' condition may also be added to the [now absent] WHERE clause of the sub-query, for efficiency purposes (unless MySQL's optimizer is very smart, all hi-scores for all game + user combinations currently get calculated, whereas we only need those for user '2345'). The downside is that the condition is duplicated; variables are one possible solution.
There are several ways to deal with the issues mentioned above, but these seem to be out of scope for a [now rather lengthy] explanation about the GROUP BY constructs.
Every field you have in your SELECT (when a GROUP BY clause is present) must be either one of the fields in the GROUP BY clause, or else a group function such as MAX, SUM, AVG, etc. In your code, userid is technically violating that, but in a pretty harmless fashion (you could make your code technically SQL standard compliant with a GROUP BY gameid, userid). The fields id and date are in more serious violation: there will be many ids and dates within one GROUP BY set, and you're not telling the engine how to make a single value out of that set (MySQL picks a more-or-less random one; stricter SQL engines might more helpfully give you an error).
I know you want the id and date corresponding to the maximum score for a given grouping, but that's not explicit in your code. You'll need a subselect or a self-join to make it explicit!
Use:
SELECT t.id,
t.gameid,
t.userid,
t.name,
t.score,
t.date
FROM HIGHSCORES t
JOIN (SELECT hs.gameid,
hs.userid,
MAX(hs.score) AS max_score
FROM HIGHSCORES hs
GROUP BY hs.gameid, hs.userid) mhs ON mhs.gameid = t.gameid
AND mhs.userid = t.userid
AND mhs.max_score = t.score
WHERE t.userid = '2345'
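The join-on-maximum approach can be checked against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module; the column types are assumptions, and an ORDER BY t.id is added only so that the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE highscores (id INTEGER PRIMARY KEY, gameid INTEGER, "
    "userid TEXT, name TEXT, score INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO highscores VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 38, "2345", "A", 100, "2009-07-23 16:45:01"),
        (2, 39, "2345", "A", 500, "2009-07-20 16:45:01"),
        (3, 31, "2345", "A", 100, "2009-07-20 16:45:01"),
        (4, 38, "2345", "A", 200, "2009-10-20 16:45:01"),
        (5, 38, "2345", "A", 50, "2009-07-20 16:45:01"),
        (6, 32, "2345", "A", 120, "2009-07-20 16:45:01"),
        (7, 32, "2345", "A", 100, "2009-07-20 16:45:01"),
    ],
)

# The derived table computes the top score per (gameid, userid); joining it
# back on that score recovers the full row, including id and date.
query = """
    SELECT t.id, t.gameid, t.userid, t.name, t.score, t.date
    FROM highscores t
    JOIN (SELECT hs.gameid, hs.userid, MAX(hs.score) AS max_score
          FROM highscores hs
          GROUP BY hs.gameid, hs.userid) mhs
      ON mhs.gameid = t.gameid
     AND mhs.userid = t.userid
     AND mhs.max_score = t.score
    WHERE t.userid = '2345'
    ORDER BY t.id
"""
best = conn.execute(query).fetchall()
best_ids = [row[0] for row in best]

for row in best:
    print(row)
```

The rows returned are ids 2, 3, 4 and 6, matching the desired result in the question. If a user reaches the same top score twice in one game, both rows come back, so a tie-breaker would be needed in that case.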