I have this query:
select id, convert(nvarchar(10), pubdate, 102) as pubdate,
channel_title, title, description, link, vertinimas
from table1
where statusid > 0
and channel_title = 'channel1'
group by title
order by pubdate desc
To exclude duplicate entries in the "title" field, I added GROUP BY title at the end, but an error occurs:
"is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
GROUP BY is normally used together with aggregate functions like count(), min(), max(), sum(), etc. When a query has a GROUP BY clause, the select list can only contain columns that are part of the GROUP BY clause or columns that are passed to an aggregate function.
For example, suppose you have a STUDENT table like the one below:
ID  NAME  SUBJECT  MARKS
1   FOO   ENGLISH  80
2   FOO   MATH     70
3   BAR   ENGLISH  100
4   BAR   MATH     50
5   ZIL   ENGLISH  90
6   ZIL   MATH     75
You can write a query like:
SELECT NAME, SUM(MARKS) AS TOTAL FROM STUDENT GROUP BY NAME;
Here, NAME is part of the GROUP BY clause, and we are applying the sum() aggregate function to the MARKS column. This gives a result like the one below:
NAME  TOTAL
FOO   150
BAR   150
ZIL   165
In the query in your post, only title is part of the GROUP BY clause. All the other columns (id, pubdate, channel_title, description, link, vertinimas) are neither part of the GROUP BY clause nor passed as a parameter to any aggregate function.
If you want to find, exclude, or delete duplicate rows, you can check out this blog post, which explains it pretty well. Here is the link to find and delete duplicate records!
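For the original goal (one row per title, with all the other columns still available), one common pattern is to pick a single id per title in a grouped subquery and join back to the table. The snippet below is only a sketch of that idea run through Python's sqlite3 module: the sample rows are made up, only a few of the columns are included, and keeping the lowest id per title is an assumption about which duplicate you want to keep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of table1 with invented rows.
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT, pubdate TEXT, statusid INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, "first post", "2020-01-01", 1),
        (2, "first post", "2020-01-02", 1),   # duplicate title
        (3, "second post", "2020-01-03", 1),
        (4, "hidden post", "2020-01-04", 0),  # filtered out by statusid > 0
    ],
)

# The subquery groups by title and keeps one id per group (assumption: the lowest).
# Joining back on that id returns full rows without breaking the GROUP BY rule.
rows = conn.execute(
    """
    SELECT t.id, t.title, t.pubdate
    FROM table1 AS t
    JOIN (SELECT MIN(id) AS keep_id
          FROM table1
          WHERE statusid > 0
          GROUP BY title) AS k ON k.keep_id = t.id
    ORDER BY t.pubdate DESC
    """
).fetchall()
print(rows)
```

Every selected column now comes from a plain, ungrouped row, so the "not contained in either an aggregate function or the GROUP BY clause" error should not apply to the outer query.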
I am querying a Presto table where I want to calculate what percentage of the total a certain subset of the rows account for.
Consider a table like this:
id  m
1   5
1   7
2   9
3   8
I want a query to report how much of the total measure (m) is contributed by each id. In this example, the total of the measure column is 29, which I can find with a query like:
SELECT SUM("m") FROM t;
output:
sqlite> SELECT SUM("m") FROM t;
29
Then I want to subtotal by id for some of the ids like
SELECT "id", SUM("m") AS "sub_total" FROM t WHERE "id" IN ('1','3') GROUP BY id;
output:
sqlite> SELECT "id", SUM("m") AS "sub_total" FROM t WHERE "id" IN ('1','3') GROUP BY id;
1|12
3|8
Now I want to add a third column where the subtotals are divided by the grand total (29) to get the percentage for each selected id.
I tried:
sqlite>
WITH a AS (
SELECT SUM("m") AS g FROM t )
SELECT "id", SUM("m") AS "sub_total", SUM(m)*100/"a"."g"
FROM a, t
WHERE "t"."id" IN ('1','3') GROUP BY "t"."id";
output:
1|12|41
3|8|27
Which is all good in SQLite3! But when I translate this to my actual Presto DB (with the right tables and columns), I get this error:
presto error: line 10:5: 'a.g' must be an aggregate expression or appear in GROUP BY clause
I can't understand what I'm missing here or why this would be different in Presto.
When you have a GROUP BY in your query, every expression the query returns must be either:
an expression you are grouping by,
or an aggregate function.
For example, if you do GROUP BY id, the query returns one row per id. You cannot just use m, because for id = 1 there are two values, 5 and 7, so which one should be returned? The first value, the last, the sum, the average? You need to say which by using an aggregate function like sum(m).
The same applies to a.g: you need to add it to GROUP BY.
WITH a AS (
SELECT SUM("m") AS g FROM t )
SELECT "id", SUM("m") AS "sub_total", SUM(m)*100/"a"."g"
FROM a, t
WHERE "t"."id" IN ('1','3') GROUP BY "t"."id", "a"."g";
There's nothing special about PrestoDB here; SQLite is simply less strict. Most other database engines would complain about your query as well.
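To check the corrected query against the sample table from the question, here is a small sketch using Python's sqlite3 module. It only shows that adding a.g to the GROUP BY leaves the SQLite result unchanged; it does not run against Presto. The ORDER BY and the pct alias are additions for the example, and the ids are compared as integers rather than strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, m INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 5), (1, 7), (2, 9), (3, 8)])

# a.g is now listed in GROUP BY, so every selected expression is either
# a grouping expression or an aggregate.
rows = conn.execute(
    """
    WITH a AS (SELECT SUM(m) AS g FROM t)
    SELECT t.id, SUM(t.m) AS sub_total, SUM(t.m) * 100 / a.g AS pct
    FROM a, t
    WHERE t.id IN (1, 3)
    GROUP BY t.id, a.g
    ORDER BY t.id
    """
).fetchall()
print(rows)
```

Because a has exactly one row, grouping by a.g as well does not split any group; it only makes the query acceptable to stricter engines.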
I have an ERD, and I want to write a SQL query for it.
The idea is to select all columns of artgrp where regroupid is 11, grouped by artdept.
I have this:
Select *
From artgrp
Where regroudid = "11"
Group by artdept;
My question is: how can I write: select all columns of artgrp group by the columns of artdept?
Here is my model
SELECT d.description, d.lifetime, d.name, COUNT(d.artdeptid) as Departments
FROM artgrp g INNER JOIN artdept d ON g.artdeptid = d.artdeptid
WHERE regroudid = '11'
GROUP BY d.description, d.lifetime, d.name
GROUP BY is used to compute a count, max, min, avg, etc. for each cluster of rows in your data. To make this concrete, say you have a Cars table with fields make, color, price, and you want to see the number of cars of each color (each color is one cluster). You can use the following query:
select color, count(1) from cars group by color;
Output will look like this
Blue 3
Grey 17
Red 5
Note: the columns you group by are normally the ones you list in the select clause as well. In the example above I grouped by color. If you add another field, say make, each cluster is a (make, color) pair and the output would be:
Ford Blue 3
Ford Grey 7
Honda Red 5
Honda Grey 10
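The two outputs above are consistent with each other, which is easy to check with a quick sketch in Python's sqlite3 module. The table is filled with exactly as many rows as the counts shown; the price column is left out because it plays no role in the grouping, and the ORDER BY clauses are only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (make TEXT, color TEXT)")

# One row per car, matching the counts in the outputs above.
counts = {("Ford", "Blue"): 3, ("Ford", "Grey"): 7, ("Honda", "Red"): 5, ("Honda", "Grey"): 10}
for (make, color), n in counts.items():
    conn.executemany("INSERT INTO cars VALUES (?, ?)", [(make, color)] * n)

# One cluster per color.
by_color = conn.execute(
    "SELECT color, COUNT(1) FROM cars GROUP BY color ORDER BY color"
).fetchall()

# One cluster per (make, color) pair.
by_make_color = conn.execute(
    "SELECT make, color, COUNT(1) FROM cars GROUP BY make, color ORDER BY make, color"
).fetchall()

print(by_color)
print(by_make_color)
```

Note how Grey is 17 when grouping by color alone, and splits into 7 and 10 once make is added to the GROUP BY.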
So first decide which function you need to apply to each group, such as count, min, max, avg, etc. If you want all the fields in your select clause, you will have to use an analytic (window) function. Edit your question to include sample data and the expected output, and I can give you an answer with an analytic query as well if required.
Thanks for editing your question. Sample data would make it clearer, but I will go ahead with what I understood. I am using an analytic function as the solution here:
Select artgrpid,description,descid,relgroupid,
artdeptid,default,name,brand,questions,
ROW_NUMBER() over (partition by artdeptid order by artdeptid) from
artgrp;
You can use rank() or count(1) etc. instead of ROW_NUMBER().
I'm a newbie programmer, and I want to sum the values in an employee attendance record.
What should I choose: COUNT or SUM?
I tried to use COUNT functions like this...
SELECT COUNT(jlh_sakit) AS sakit FROM rekap_absen
It shows the value changed to "1", for 1 record only.
And I try to use SUM functions like this...
SELECT SUM(jlh_sakit) AS sakit FROM rekap_absen
It shows all values changed to "1".
I want to display only 1 row per person, with each sum
(e.g.: John (2 sick, 2 permissions, 1 alpha))
Can you help me please?
If you select an aggregate function like min/max/sum/count together with a regular column, you need a GROUP BY on that column. As for "what should I choose? COUNT or SUM?": assuming you have the columns person_name and jlh_sakit (which holds sick/permission/alpha in your case), you could use
select person_name, count(jlh_sakit) as attribute
from rekap_absen
group by person_name
This will give you output like:
person_name attribute
John 2
King 5
To get the sums per person, use a GROUP BY clause on the person column.
SELECT SUM(sick),SUM(alpha),SUM(permissions),person FROM rekap_absen group by person
It will group your sums according to person.
You may name your sums like:
SELECT SUM(sick) as sick,SUM(alpha) as alpha,SUM(permissions) as permissions,person FROM rekap_absen group by person
Assuming that you have table rekap_absen with columns: person,sick,alpha,permissions
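Here is a small sketch of that query in Python's sqlite3 module, under the same assumption about the columns (person, sick, alpha, permissions). The rows are invented: one row per absence with a 0/1 flag in each column, chosen so that John ends up with 2 sick, 2 permissions and 1 alpha as in the question. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rekap_absen (person TEXT, sick INTEGER, alpha INTEGER, permissions INTEGER)"
)
# Hypothetical data: each row is one absence, flagged in exactly one column.
conn.executemany(
    "INSERT INTO rekap_absen VALUES (?, ?, ?, ?)",
    [
        ("John", 1, 0, 0),
        ("John", 1, 0, 0),
        ("John", 0, 0, 1),
        ("John", 0, 0, 1),
        ("John", 0, 1, 0),
        ("King", 1, 0, 0),
    ],
)

# SUM adds up the flags; GROUP BY person collapses the rows to one per person.
rows = conn.execute(
    """
    SELECT SUM(sick) AS sick, SUM(alpha) AS alpha, SUM(permissions) AS permissions, person
    FROM rekap_absen
    GROUP BY person
    ORDER BY person
    """
).fetchall()
print(rows)
```

With flags like these, SUM is the right choice. COUNT(sick) would count every non-NULL value, including the zeros, so it would return the number of rows per person instead.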
Basically I have so far managed to produce this:
sqlite> SELECT city, COUNT(bname) AS sum FROM building GROUP BY city;
city sum
---------- ----------
Leeds 3
London 1
New York 2
Paris 1
However what I want to do is only print the Cities that have a sum > 1.
I have tried this:
sqlite> SELECT city, COUNT(bname) AS sum FROM building WHERE sum>1 GROUP BY city;
But I get the error:
Error: misuse of aggregate: COUNT()
Can someone explain why this isn't working, and what I should do instead?
Thanks
When you want to limit your resultset based on an aggregate function you use the HAVING clause instead of where:
SELECT city,
COUNT(bname) AS sum
FROM building
GROUP BY city
HAVING COUNT(bname) > 1;
There are good explanations on Stack Overflow of why you can't use WHERE, but here's a short one:
Here's what happens when you do a SELECT:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
(This list can be bigger in some RDBMS, but this should give you an idea)
WHERE is evaluated before GROUP BY, so the COUNT does not exist yet when WHERE runs and you can't filter on it there. HAVING is evaluated after the grouping, so it can use the aggregate.
You need to use HAVING when filtering based on the result of an aggregate function:
SELECT city, COUNT(bname) AS sum
FROM building
GROUP BY city
HAVING COUNT(bname) > 1;
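A quick way to see both behaviours side by side is the sketch below, using Python's sqlite3 module. The building names are made up; only the number of buildings per city matches the output in the question. The ORDER BY is added to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE building (bname TEXT, city TEXT)")
# Hypothetical building names; counts per city follow the question's output.
conn.executemany(
    "INSERT INTO building VALUES (?, ?)",
    [
        ("b1", "Leeds"), ("b2", "Leeds"), ("b3", "Leeds"),
        ("b4", "London"),
        ("b5", "New York"), ("b6", "New York"),
        ("b7", "Paris"),
    ],
)

# The WHERE version fails, because the aggregate is not available at that stage.
try:
    conn.execute(
        "SELECT city, COUNT(bname) AS sum FROM building WHERE sum > 1 GROUP BY city"
    ).fetchall()
    where_failed = False
except sqlite3.Error as exc:
    where_failed = True
    print("WHERE version failed:", exc)

# The HAVING version filters after the groups have been formed.
rows = conn.execute(
    """
    SELECT city, COUNT(bname) AS sum
    FROM building
    GROUP BY city
    HAVING COUNT(bname) > 1
    ORDER BY city
    """
).fetchall()
print(rows)
```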
I have a table with columns id, title, relation_key. I want to get count(*) as well as the title for the corresponding relation_key.
My table contains the following data:
id title relation_key
55 title1111 10
56 title2222 10
57 MytitleVVV 20
58 MytitlleXXX 20
I tried:
select title,count(*) from table where relation_key=10 group by title
But it's returning only 1 row. I want both title records for relation_key=10.
You probably want something along these lines:
select title, count(*) over (partition by relation_key)
from table
where relation_key = 10
The result of this would yield:
title | count
----------+------
title1111 | 2
title2222 | 2
Note that you cannot select fields that are not part of the GROUP BY clause in Oracle (as in most other databases).
As a general rule of thumb, you should avoid grouping if you don't really want to group data, but just use aggregate functions such as count(*). Most of Oracle's aggregate functions can be transformed into window functions by adding an over() clause, removing the need for a GROUP BY clause.
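If window functions are not available in your environment, a join against a grouped subquery gives the same shape of result. This is a different technique from the over() query, shown here only as a sketch in Python's sqlite3 module rather than Oracle. The table name titles is a stand-in (table itself is a reserved word), and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "titles" is a hypothetical name standing in for the question's table.
conn.execute("CREATE TABLE titles (id INTEGER, title TEXT, relation_key INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        (55, "title1111", 10),
        (56, "title2222", 10),
        (57, "MytitleVVV", 20),
        (58, "MytitlleXXX", 20),
    ],
)

# The subquery computes one count per relation_key; the join attaches that
# count to every row, so no row is collapsed by the grouping.
rows = conn.execute(
    """
    SELECT t.title, c.cnt
    FROM titles AS t
    JOIN (SELECT relation_key, COUNT(*) AS cnt
          FROM titles
          GROUP BY relation_key) AS c ON c.relation_key = t.relation_key
    WHERE t.relation_key = 10
    ORDER BY t.id
    """
).fetchall()
print(rows)
```

Both titles for relation_key 10 come back, each paired with the count of 2.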
If you are getting an error, please try the following:
select title,count(*) from table where relation_key=10 group by title,relation_key