Subquery requiring group by without single row function - sql

From my understanding, a query that combines one or more aggregate functions with at least one single-row function (or plain column) must list those non-aggregated expressions in a GROUP BY clause, which makes sense overall.
However, I'm working through problems in an online resource and ran into the question in the picture. I answered that it executes successfully but gives improper output, because the subquery contains only an aggregate function, which led me to believe it needs no GROUP BY. Why does the subquery require a GROUP BY?

As Gordon has already explained, a nested aggregate requires a GROUP BY clause. Consider the query in two parts. The first part works fine on its own when HAVING compares against a specific value.
Example (you can run these queries at the following link):
https://livesql.oracle.com/apex/f?p=590:1:104596775146183::NO:RP::
select count(*), PROD_CATEGORY_ID from SH.PRODUCTS group by PROD_CATEGORY_ID
having count(*)>15;
But we get an error if we nest two aggregate functions without a GROUP BY:
select max(count(PROD_CATEGORY_ID)) from SH.PRODUCTS; -- throws ORA-00978
Adding the GROUP BY makes it work:
select max(count(PROD_CATEGORY_ID)) from SH.PRODUCTS
group by PROD_CATEGORY_ID; -- returns the largest per-category count
Combining the two parts gives the final result:
select count(*), PROD_CATEGORY_ID from SH.PRODUCTS
group by PROD_CATEGORY_ID
having count(*)=(select max(count(*)) from SH.PRODUCTS group by PROD_CATEGORY_ID);
There are more good examples at this link:
https://mahtodeepak05.wordpress.com/2014/12/17/aggregate-function-nesting-in-oracle/

You can easily test this:
select max(count(*))
from dual;
The error is:
ORA-00978: nested group function without GROUP BY
So, a nested group function seems to require a GROUP BY.
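For engines that do not accept nested aggregates at all, the same "largest per-category count" can be written with a derived table. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle; the products table and its rows are made up for illustration and are not the SH.PRODUCTS data.

```python
import sqlite3

# Hypothetical stand-in for SH.PRODUCTS: three rows in category 10,
# two in category 20, one in category 30.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id INTEGER, prod_category_id INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 30)],
)

# Inner query: one count per category (this is the part that needs GROUP BY).
# Outer query: a single MAX over those counts (no GROUP BY needed here).
max_count = conn.execute(
    """
    SELECT MAX(cnt)
    FROM (SELECT COUNT(*) AS cnt
          FROM products
          GROUP BY prod_category_id) AS per_category
    """
).fetchone()[0]

# Categories whose row count equals that maximum.
top_categories = conn.execute(
    """
    SELECT prod_category_id, COUNT(*)
    FROM products
    GROUP BY prod_category_id
    HAVING COUNT(*) = (SELECT MAX(cnt)
                       FROM (SELECT COUNT(*) AS cnt
                             FROM products
                             GROUP BY prod_category_id) AS per_category)
    """
).fetchall()

print(max_count, top_categories)
```

The inner GROUP BY plays the same role as the GROUP BY that Oracle demands for max(count(*)): without it there is only one count, so there is nothing to take a maximum over.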

Related

Can LAG be used with HAVING?

I distinctly recall that T-SQL will never let you mix LAG and WHERE. For example,
SELECT FOO
WHERE LAG(BAR) OVER (ORDER BY DATE) > 7
will never work. T-SQL will not run it no matter what you do. But does T-SQL ever let you mix LAG with HAVING?
Note: All that an answer needs to do is either give a theory-based or documentation-based reason why it does not, or give any example at all of where it does.
From Logical Processing Order of the SELECT statement:
The following steps show the logical processing order, or binding
order, for a SELECT statement......
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
Window functions are evaluated at the SELECT step, which comes after HAVING, so the answer is no: you can't use window functions in the HAVING clause.
HAVING filters the groups produced by GROUP BY (or the whole result treated as a single group), so its conditions are built from the grouping columns and from aggregate functions such as MIN, MAX, SUM and COUNT. LAG is an analytic (window) function, not an aggregate, and T-SQL only allows window functions in the SELECT and ORDER BY clauses. Hence it is not possible to use LAG directly in a HAVING clause.
To filter on a LAG result, compute it in a CTE or subquery and apply the filter in the outer query.
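The CTE approach can be sketched as follows. This runs against SQLite through Python's sqlite3 module rather than SQL Server, and it assumes a SQLite build with window function support (3.25 or later); the readings table, its columns and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (reading_date TEXT, bar INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("2024-01-01", 5), ("2024-01-02", 9), ("2024-01-03", 3), ("2024-01-04", 8)],
)

# Step 1 (the CTE): compute LAG in a SELECT list, where window functions are allowed.
# Step 2 (the outer query): filter on the already-computed column with a plain WHERE.
rows = conn.execute(
    """
    WITH with_prev AS (
        SELECT reading_date,
               bar,
               LAG(bar) OVER (ORDER BY reading_date) AS prev_bar
        FROM readings
    )
    SELECT reading_date, bar, prev_bar
    FROM with_prev
    WHERE prev_bar > 7
    ORDER BY reading_date
    """
).fetchall()

print(rows)
```

By the time the outer WHERE runs, prev_bar is an ordinary column of the CTE, so the logical processing order is no longer an obstacle. The first row has no previous value, so its prev_bar is NULL and it is filtered out.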

what is aggregate function in sql?

I have two queries that both work fine when executed separately:
select distinct style_ref
from tbl_Size
where order_ref='123';
select sum(quantity)
from tbl_size
where order_ref='123';
But if I try to combine them, it does not work:
select distinct
style_ref, sum(quantity)
from
tbl_size
where
order_ref='123'
This error appears:
Column 'tbl_Size.style_ref' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
An aggregate function is one that combines several records into a single one. In your case, SUM. You're taking the sum, clearly, of more than one row at a time. Another example might be AVG, to get the average of several values.
You can't run aggregate functions, as your error says, alongside ungrouped columns, because that introduces multiple "layers" of data. In one row, you'd have something that described the entire dataset, and you'd have something else that described only a single record. This would be confusing, not to mention inefficient.
Rather than using DISTINCT in your example, you're probably looking to GROUP BY your column:
SELECT style_ref, sum(quantity)
FROM tbl_size
WHERE order_ref='123'
GROUP BY style_ref
This will group the records by their style_ref value, then give you the sum of the quantities in each group. Thus, assuming your schema naming is accurate, it will tell you the total quantity for each style_ref within order 123.
The above query is equivalent in meaning to the following:
SELECT DISTINCT style_ref, (SELECT SUM(quantity)
FROM tbl_size AS B
WHERE B.order_ref = '123'
AND B.style_ref = tbl_size.style_ref)
FROM tbl_size
WHERE order_ref = '123'
As you can see, the GROUP BY solution is much, much cleaner and better to use. I included the second form only because it arguably describes what the query returns in a more readable way. You can see here how the aggregate function (SUM) could be described as working on a separate plane from the style_ref column, so it would be hard to combine the two into a single query without GROUP BY.
An aggregate function is a function that returns one result for many rows - like sum in your example.
You can use them in conjunction with the group by clause in order to get one result per group:
select style_ref, sum(quantity)
from tbl_size
where order_ref='123'
group by style_ref
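To check that the GROUP BY form and the correlated-subquery form really return the same rows, here is a small sketch using Python's sqlite3 module. The table and column names come from the question; the rows are made up, and SQLite is only a stand-in for SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_size (order_ref TEXT, style_ref TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO tbl_size VALUES (?, ?, ?)",
    [("123", "A", 2), ("123", "A", 3), ("123", "B", 4), ("999", "A", 100)],
)

# One output row per style_ref, each carrying the sum over that group.
grouped = conn.execute(
    """
    SELECT style_ref, SUM(quantity)
    FROM tbl_size
    WHERE order_ref = '123'
    GROUP BY style_ref
    ORDER BY style_ref
    """
).fetchall()

# Same result, spelled out with DISTINCT plus a correlated subquery.
correlated = conn.execute(
    """
    SELECT DISTINCT style_ref,
           (SELECT SUM(b.quantity)
            FROM tbl_size AS b
            WHERE b.order_ref = '123'
              AND b.style_ref = tbl_size.style_ref)
    FROM tbl_size
    WHERE order_ref = '123'
    ORDER BY style_ref
    """
).fetchall()

print(grouped)
print(correlated)
```

The row for order 999 is excluded by the WHERE clause in both forms, so it does not leak into the sum for style A.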

Select all columns on a group by throws error

I ran a query against Northwind database Products Table like below
select * from Northwind.dbo.Products GROUP BY CategoryID
and I was hit with an error; I am sure you will get the same one. What is the correct statement to group all products by their category IDs?
Edit: this link really helped me understand a lot:
http://weblogs.sqlteam.com/jeffs/archive/2007/07/20/but-why-must-that-column-be-contained-in-an-aggregate.aspx
You need to use an Aggregate function and then group by any non-aggregated columns.
I recommend reading up on GROUP BY.
If you're using GROUP BY in a query, every item in your SELECT list must either be wrapped in an aggregate function, e.g. SUM() or COUNT(), or be included in the GROUP BY clause.
Because you are using SELECT *, this is equivalent to listing ALL columns in your SELECT.
Therefore, either list them all in the GROUP BY too, use aggregating functions for the rest where possible, or only select the CategoryID.
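As an illustration of the "aggregate the rest" option, here is a sketch run through Python's sqlite3 module. The column names mirror Northwind's Products table, but the three rows are invented. Note that SQLite itself is lenient and would accept SELECT * with GROUP BY, unlike SQL Server, so the sketch only shows the corrected shape of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "ProductID INTEGER, ProductName TEXT, CategoryID INTEGER, UnitPrice REAL)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [(1, "Chai", 1, 18.0), (2, "Chang", 1, 19.0), (3, "Aniseed Syrup", 2, 10.0)],
)

# CategoryID is the grouping column; every other selected item is an aggregate.
summary = conn.execute(
    """
    SELECT CategoryID,
           COUNT(*)       AS product_count,
           MIN(UnitPrice) AS cheapest,
           MAX(UnitPrice) AS dearest
    FROM Products
    GROUP BY CategoryID
    ORDER BY CategoryID
    """
).fetchall()

print(summary)
```

If the goal is to list every product next to its category rather than to summarize, ORDER BY CategoryID on an ungrouped query is probably what is wanted instead.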

SQL statistical authorization

I am trying to understand how statistical authorization works in SQL.
The query result must be a single aggregate value.
This means you may only use SUM(salary) or COUNT(*) in the select list. If the ID were included, individual employees could be identified.
At least 3 different tuples should be used in the aggregate to produce each query result.
You can include a HAVING clause like this:
having count(distinct ID) >= 3
I don't understand the rest of the question.
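A rough sketch of how that HAVING clause suppresses small groups, using Python's sqlite3 module. The employees table, the dept column and the salaries are all hypothetical; only the having count(distinct ID) >= 3 condition comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "eng", 100), (2, "eng", 120), (3, "eng", 110), (4, "ops", 90), (5, "ops", 95)],
)

# Only an aggregate appears in the select list, and a group is reported
# only if at least 3 distinct employees contributed to it.
rows = conn.execute(
    """
    SELECT SUM(salary)
    FROM employees
    GROUP BY dept
    HAVING COUNT(DISTINCT id) >= 3
    """
).fetchall()

print(rows)
```

The "ops" group has only two employees, so its total is withheld; otherwise one salary could be worked out from the sum once the other is known.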

pgSQL query error

I tried using this query:
"SELECT * FROM guests WHERE event_id=".$id." GROUP BY member_id;"
and I'm getting this error:
ERROR: column "guests.id" must appear in the GROUP BY clause or be used in an aggregate function
Can anyone explain how I can work around this?
You can't Group By without letting the Select know what to take, and how to group.
Try
"SELECT guests.member_id FROM guests WHERE event_id=".$id." GROUP BY member_id;"
If you need to get more info from this table about the guests, you'll need to add it to the GROUP BY.
Plus, it seems like your select should actually be
"SELECT guests.id FROM guests WHERE event_id=".$id." GROUP BY id;"
Each column used in a GROUP BY query needs to be called out specifically (i.e., don't do SELECT * FROM ...), because each one must either be wrapped in some sort of aggregate function (min/max/sum/avg/count/etc.) or be part of the GROUP BY clause.
For example:
SELECT instrument, detector, min(date_obs), max(date_obs)
FROM observations
WHERE observatory='SOHO'
GROUP BY instrument, detector;