Select the max of averages with SQL

I'm trying to get the max of the averages using this query:
select code, avg(note)
from exam
group by code
having avg(note)=(select max(avg(note)) from exam group by code)
but I get this error:
Invalid use of group function
What did I do wrong?

Aggregate functions cannot be nested at the same query level, so max(avg(note)) is rejected. An alternative approach is to order by the average in descending order and take the first row:
SELECT code, avg(note) AS avg_note
FROM exam
GROUP BY code
ORDER BY avg_note DESC
LIMIT 1;
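If several codes can share the top average, ORDER BY ... LIMIT 1 returns only one of them. Below is a minimal sketch of both variants, run against SQLite through Python's sqlite3 module purely so it is self-contained (the question is about MySQL, and the sample rows are invented). The second query moves the averages into a derived table so that MAX is applied one level above AVG.

```python
import sqlite3

# In-memory database; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (code TEXT, note REAL)")
conn.executemany(
    "INSERT INTO exam (code, note) VALUES (?, ?)",
    [("A", 10), ("A", 14), ("B", 16), ("B", 8), ("C", 5), ("C", 9)],
)

# ORDER BY ... LIMIT 1 returns a single row; "code" is added as a tie-breaker
# so the result is deterministic when two codes share the top average.
top_row = conn.execute(
    """
    SELECT code, AVG(note) AS avg_note
    FROM exam
    GROUP BY code
    ORDER BY avg_note DESC, code
    LIMIT 1
    """
).fetchone()

# The averages are computed in a derived table, so MAX runs one level
# above AVG and every code that reaches the maximum is returned.
tied_rows = conn.execute(
    """
    SELECT code, AVG(note) AS avg_note
    FROM exam
    GROUP BY code
    HAVING AVG(note) = (
        SELECT MAX(avg_note)
        FROM (SELECT AVG(note) AS avg_note FROM exam GROUP BY code) AS per_code
    )
    ORDER BY code
    """
).fetchall()

print(top_row)
print(tied_rows)
```

With this sample data, codes A and B both average 12, so the first query returns only A while the second returns both.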

Related

MAX in Select statement not returning the highest value?

I have a question about using MAX in a SELECT statement.
Without MAX, I have this query:
SELECT stockID, DATE, close, symbol
FROM ta_stockprice JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
WHERE stockid = 8648
ORDER BY close
In the end I only want the row with the maximum value in the close column, so I added MAX to the query.
Why didn't I get date = "2021-07-02" as output?
(I noticed that I always get "2021-07-01" as output, no matter whether I use MAX / MIN / AVG...)
MAX() turns the query into an aggregation query. With no GROUP BY, it returns exactly one row. But the query is not valid standard SQL, because it mixes aggregated and unaggregated columns.
Older versions of MySQL allowed such syntax, in violation of the SQL standard, but returned values from arbitrary rows for the unaggregated columns.
Use ORDER BY to do what you want:
SELECT stockID, DATE, close, symbol
FROM ta_stockprice JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
WHERE stockid = 8648
ORDER BY close DESC
LIMIT 1;
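Here is a small runnable check of that query, as a sketch only: it uses SQLite via Python's sqlite3 module instead of MySQL, and the symbol and prices are invented. The table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ta_stock (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE ta_stockprice (stockID INTEGER, date TEXT, close REAL);
    -- Invented sample rows; the highest close falls on 2021-07-02.
    INSERT INTO ta_stock VALUES (8648, 'XYZ');
    INSERT INTO ta_stockprice VALUES
        (8648, '2021-06-30', 11.0),
        (8648, '2021-07-01', 10.0),
        (8648, '2021-07-02', 12.5);
    """
)

# Sorting by close and keeping one row returns the whole row that holds
# the maximum, so the date and the close value belong together.
max_close_row = conn.execute(
    """
    SELECT stockID, date, close, symbol
    FROM ta_stockprice
    JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
    WHERE stockID = 8648
    ORDER BY close DESC
    LIMIT 1
    """
).fetchone()

print(max_close_row)
```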

Distinct count and group by in HIVE

I am very new to HIVE and have an issue with a distinct count and GROUP BY.
I want to calculate the maximum temperature from the temperature_data table for those years that have at least 2 entries in the table.
I tried the query below, but it does not work:
select
SUBSTRING(full_date,7,4) as year,
MAX(temperature) as temperature
from temperature_data
where count(distinct(SUBSTRING(full_date,7,4))) >= 2
GROUP BY SUBSTRING(full_date,7,4);
I am getting this error:
FAILED: SemanticException [Error 10128]: Line 2:0 Not yet supported place for UDAF 'count'
Below is the input:
year,zip,temperature
10-01-1990,123112,10
14-02-1991,283901,11
10-03-1990,381920,15
10-01-1991,302918,22
12-02-1990,384902,9
10-01-1991,123112,11
14-02-1990,283901,12
10-03-1991,381920,16
10-01-1990,302918,23
12-02-1991,384902,10
10-01-1993,123112,11
Use HAVING instead of WHERE to put a condition on an aggregate, such as the number of rows in each group.
A subquery also helps here, because it lets you compute the year once and refer to it by name. See below.
SELECT
year,
MAX(t1.temperature) as temperature
FROM
(select SUBSTRING(full_date,7,4) year, temperature from temperature_data) t1
GROUP BY
year
HAVING
count(t1.year) >= 2;
@R.Gold, we can simplify the query above and avoid the subquery:
SELECT substring(full_date,7) as year, max(temperature)
FROM temperature_data
GROUP BY substring(full_date,7)
HAVING COUNT(substring(full_date,7)) >= 2
Also, FYI: aggregate functions can't be used in a WHERE clause.
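To check the HAVING approach against the sample input from the question, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than on Hive, so substr stands in for Hive's SUBSTRING, and the column is assumed to be called full_date as in the queries above.

```python
import sqlite3

# The eleven sample rows from the question: (full_date, zip, temperature).
rows = [
    ("10-01-1990", 123112, 10), ("14-02-1991", 283901, 11),
    ("10-03-1990", 381920, 15), ("10-01-1991", 302918, 22),
    ("12-02-1990", 384902, 9), ("10-01-1991", 123112, 11),
    ("14-02-1990", 283901, 12), ("10-03-1991", 381920, 16),
    ("10-01-1990", 302918, 23), ("12-02-1991", 384902, 10),
    ("10-01-1993", 123112, 11),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperature_data (full_date TEXT, zip INTEGER, temperature INTEGER)"
)
conn.executemany("INSERT INTO temperature_data VALUES (?, ?, ?)", rows)

# WHERE filters rows before grouping; HAVING filters the groups afterwards,
# which is where a condition on COUNT belongs.
max_by_year = conn.execute(
    """
    SELECT substr(full_date, 7, 4) AS year, MAX(temperature) AS temperature
    FROM temperature_data
    GROUP BY substr(full_date, 7, 4)
    HAVING COUNT(*) >= 2
    ORDER BY year
    """
).fetchall()

print(max_by_year)
```

1993 has a single entry, so it is filtered out, leaving 1990 (max 23) and 1991 (max 22).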

SQL Query not working

I seem to be getting this error while trying to run the below query:
SELECT
to_char(EFFECTIVE_DT,'YYYY-MM') as YYYYMM,
--EFFECTIVE_DT,
AH01_PAYMENT_STATUS_CTD,
TSYS_ACCT_ID
FROM OIS_TSYS.AH_CYCLE_HIST
WHERE 1=1
AND EFFECTIVE_DT BETWEEN '01-MAY-2017' AND '31-MAY-2017'
GROUP BY 2
ORDER BY 1
error: ORA-00979: not a GROUP BY expression
I am trying to group by date, because at the moment I get daily results for each individual account.
Result set:
65589 N 03-MAY-17
65590 S 03-MAY-17
65591 M 03-MAY-17
65592 F 03-MAY-17
65617 G 03-MAY-17
Any help would be amazing.
Best,
Saad
When you "group by 2", every other column in the select list must be wrapped in an aggregate function (sum, avg, min, max, ...).
The "1=1" condition is redundant.
When a query has a GROUP BY clause, every column in the select list that is not inside an aggregate function (sum, count, min, max, etc.) must also appear in the GROUP BY clause. You cannot group by just one column while selecting several non-aggregated ones. So in your case, all three selected columns have to go into the GROUP BY.
To get the desired result, use the query below:
SELECT
TSYS_ACCT_ID,
AH01_PAYMENT_STATUS_CTD,
to_char(EFFECTIVE_DT,'YYYY-MM') as YYYYMM
FROM OIS_TSYS.AH_CYCLE_HIST
WHERE EFFECTIVE_DT BETWEEN '01-MAY-2017' AND '31-MAY-2017'
GROUP BY
TSYS_ACCT_ID,
AH01_PAYMENT_STATUS_CTD,
to_char(EFFECTIVE_DT,'YYYY-MM')
ORDER BY 1
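A minimal sketch of what this grouping does to daily rows, using SQLite through Python's sqlite3 module instead of Oracle. The assumptions: strftime replaces to_char, dates are stored as ISO text, the schema prefix is dropped, the rows are invented, and a COUNT(*) column is added only to show how many daily rows were collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ah_cycle_hist (
        tsys_acct_id INTEGER,
        ah01_payment_status_ctd TEXT,
        effective_dt TEXT
    );
    -- Invented rows: two May days for one account, one for another,
    -- plus a June row that the WHERE clause should exclude.
    INSERT INTO ah_cycle_hist VALUES
        (65589, 'N', '2017-05-03'),
        (65589, 'N', '2017-05-04'),
        (65590, 'S', '2017-05-03'),
        (65590, 'S', '2017-06-01');
    """
)

# Every non-aggregated column in the select list is repeated in GROUP BY.
monthly_rows = conn.execute(
    """
    SELECT tsys_acct_id,
           ah01_payment_status_ctd,
           strftime('%Y-%m', effective_dt) AS yyyymm,
           COUNT(*) AS n_days
    FROM ah_cycle_hist
    WHERE effective_dt BETWEEN '2017-05-01' AND '2017-05-31'
    GROUP BY tsys_acct_id,
             ah01_payment_status_ctd,
             strftime('%Y-%m', effective_dt)
    ORDER BY 1
    """
).fetchall()

print(monthly_rows)
```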

Not a group by function at a cumulative query

I'm building a cumulative query that shows the growth in the number of clients in my database. To build it, I use the year and the week of the year in which they joined the client database.
I have the following query to fetch the relevant data:
SELECT DD.CAL_YEAR, DD.WEEK_OF_YEAR, SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_DATE ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS "Number of account statements"
FROM CLIENT_DATABASE FAB
JOIN DIM_DATE DD ON FAB.BALANCE_DATE_ID = DD.ID
GROUP BY DD.CAL_YEAR, DD.WEEK_OF_YEAR;
But when I run this query, I get the following error:
Error: ORA-00979: not a GROUP BY expression
SQLState: 42000 ErrorCode: 979
How can I fix this?
Since you are grouping by DD.CAL_YEAR, DD.WEEK_OF_YEAR, you can't use DD.CAL_DATE in the order by clause of your cumulative sum function: after grouping, CAL_DATE no longer has a single value per group.
It's hard for me to say exactly what you are trying to do without fully understanding your data. But, logically, it does seem like you should be able to simply use DD.CAL_YEAR, DD.WEEK_OF_YEAR in the order by clause instead of DD.CAL_DATE, and still get the results the way you are expecting.
So something like this:
SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_YEAR, DD.WEEK_OF_YEAR ...
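The same idea can be sketched in two steps, which some people find easier to read: first count the distinct clients per week in a derived table, then take the running sum ordered by the grouping columns. The example below uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of Oracle, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_date (
        id INTEGER PRIMARY KEY, cal_date TEXT,
        cal_year INTEGER, week_of_year INTEGER
    );
    CREATE TABLE client_database (id INTEGER, balance_date_id INTEGER);
    -- Invented rows; client 103 appears twice in the same week.
    INSERT INTO dim_date VALUES
        (1, '2020-12-28', 2020, 53),
        (2, '2021-01-04', 2021, 1),
        (3, '2021-01-05', 2021, 1),
        (4, '2021-01-11', 2021, 2);
    INSERT INTO client_database VALUES
        (101, 1), (102, 1), (103, 2), (103, 3), (104, 3), (105, 4);
    """
)

# Inner query: one row per (year, week) with the distinct client count.
# Outer query: running total ordered by the same grouping columns.
cumulative = conn.execute(
    """
    SELECT cal_year, week_of_year,
           SUM(n_clients) OVER (
               ORDER BY cal_year, week_of_year
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM (
        SELECT dd.cal_year, dd.week_of_year,
               COUNT(DISTINCT fab.id) AS n_clients
        FROM client_database fab
        JOIN dim_date dd ON fab.balance_date_id = dd.id
        GROUP BY dd.cal_year, dd.week_of_year
    ) AS per_week
    ORDER BY cal_year, week_of_year
    """
).fetchall()

print(cumulative)
```

Note that a client who shows up in two different weeks is counted once in each week, so the running total is a sum of weekly distinct counts, not a distinct count over all weeks.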

Using a timestamp function in a GROUP BY

I'm working with a large transaction data set and would like to group a count of individual customer transactions by month. I am unable to use the timestamp function in the GROUP BY and get the following error:
BAD_QUERY (expression STRFTIME_UTC_USEC([DATESTART], '%b') in GROUP BY is invalid)
Is there a simple workaround to achieve this or should I build a calendar table (which may be the simplest option)?
You have to use an alias:
SELECT STRFTIME_UTC_USEC(DATESTART, '%b') as month, COUNT(TRANSACTION)
FROM datasetId.tableId
GROUP BY month
@Charles is correct, but as an aside, you can also group by column number.
SELECT STRFTIME_UTC_USEC(DATESTART, '%b') as month, COUNT(TRANSACTION) as count
FROM [datasetId.tableId]
GROUP BY 1
ORDER BY 2 DESC
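Both forms can be compared side by side in a small sketch. It runs on SQLite through Python's sqlite3 module, not on BigQuery, so strftime('%m', ...) stands in for STRFTIME_UTC_USEC, and the table name, the txn_id column and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (datestart TEXT, txn_id INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("2021-01-05", 1), ("2021-01-20", 2), ("2021-02-03", 3)],
)

# Group by the alias given to the date expression in the select list.
by_alias = conn.execute(
    """
    SELECT strftime('%m', datestart) AS month, COUNT(txn_id) AS n
    FROM transactions
    GROUP BY month
    ORDER BY n DESC
    """
).fetchall()

# Group by position: 1 is the first select-list column, 2 the second.
by_position = conn.execute(
    """
    SELECT strftime('%m', datestart) AS month, COUNT(txn_id) AS n
    FROM transactions
    GROUP BY 1
    ORDER BY 2 DESC
    """
).fetchall()

print(by_alias)
print(by_position)
```

Grouping by alias is usually easier to maintain, since a positional reference silently changes meaning if the select list is reordered.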