Aggregate function in SQL WHERE-Clause - sql

In a university test there was a question: is it possible to use an aggregate function in the SQL WHERE clause?
I always thought this isn't possible, and I can't find any example of how it would be possible. But my answer was marked wrong, and now I want to know in which cases an aggregate function can be used in the WHERE clause. If it isn't possible, it would be nice to get a link to the specification where this is described.

HAVING is like WHERE with aggregate functions, or you could use a subquery.
select EmployeeId, sum(amount)
from Sales
group by EmployeeId
having sum(amount) > 20000
Or
select EmployeeId, sum(amount)
from Sales
where EmployeeId in (
select max(EmployeeId) from Employees)
group by EmployeeId
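To see the difference concretely, here is a minimal sketch that runs the idea through Python's built-in sqlite3 module. SQLite is used only so the example runs anywhere; the Sales rows are invented, and the exact error text differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, not from the original question.
conn.execute("CREATE TABLE Sales (EmployeeId INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, 15000), (1, 9000), (2, 5000), (3, 21000)],
)

# An aggregate placed directly in WHERE is rejected by the database.
try:
    conn.execute("SELECT EmployeeId FROM Sales WHERE sum(amount) > 20000")
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# The same condition in HAVING is evaluated after the rows are grouped.
having_rows = conn.execute(
    """
    SELECT EmployeeId, sum(amount)
    FROM Sales
    GROUP BY EmployeeId
    HAVING sum(amount) > 20000
    ORDER BY EmployeeId
    """
).fetchall()

print(where_error)
print(having_rows)
```

WHERE filters individual rows before grouping, which is why the aggregate has nothing to work on there; HAVING filters the groups afterwards.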

You haven't mentioned the DBMS. Assuming you are using MS SQL-Server, I've found a T-SQL Error message that is self-explanatory:
"An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference"
http://www.sql-server-performance.com/
And an example that it is possible in a subquery.
Show all customers and smallest order for those who have 5 or more orders (and NULL for others):
SELECT a.lastname
, a.firstname
, ( SELECT MIN( o.amount )
FROM orders o
WHERE a.customerid = o.customerid
AND COUNT( a.customerid ) >= 5
)
AS smallestOrderAmount
FROM account a
GROUP BY a.customerid
, a.lastname
, a.firstname ;
UPDATE.
The above runs in both SQL Server and MySQL, but it doesn't return the result I expected. The next one is closer. I guess the difference is that the customerid field, which is grouped by and used in the query-subquery join, is the PRIMARY KEY of the outer table in the first case and is not in the second.
Show all customer ids and number of orders for those who have 5 or more orders (and NULL for others):
SELECT o.customerid
, ( SELECT COUNT( o.customerid )
FROM account a
WHERE a.customerid = o.customerid
AND COUNT( o.customerid ) >= 5
)
AS cnt
FROM orders o
GROUP BY o.customerid ;

You can't use an aggregate directly in a WHERE clause; that's what HAVING clauses are for.
You can use a sub-query which contains an aggregate in the WHERE clause.

UPDATED query:
select id from t where id < (select max(id) from t);
It'll select all but the last row from the table t.
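A quick way to check this is to run it against a small table. The sketch below uses Python's built-in sqlite3 module and four made-up ids; "last row" here means the row with the highest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table t with ids 1 to 4.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (4,)])

# The aggregate lives inside a scalar subquery, which is allowed in WHERE.
rows = conn.execute(
    "SELECT id FROM t WHERE id < (SELECT max(id) FROM t) ORDER BY id"
).fetchall()

print(rows)
```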

SELECT COUNT( * )
FROM agents
HAVING COUNT(*)>3;
See more at the link below:
http://www.w3resource.com/sql/aggregate-functions/count-having.php

Another solution is to move the aggregate function into a scalar user-defined function.
Create your function:
CREATE FUNCTION getTotalSalesByProduct(@ProductName VARCHAR(500))
RETURNS INT
AS
BEGIN
DECLARE @TotalAmount INT
SET @TotalAmount = (select SUM(SaleAmount) FROM Sales where ProductName=@ProductName)
RETURN @TotalAmount
END
Use the function in the WHERE clause:
SELECT ProductName, SUM(SaleAmount) AS TotalSales
FROM Sales
WHERE dbo.getTotalSalesByProduct(ProductName) > 1000
GROUP BY ProductName
Hope this helps someone.

If you are using an aggregate function in a where clause then it means you want to filter data on the basis of that aggregation function. In my case, it's SUM(). I'll jump to the solution.
select * from (select sum(appqty) as summ, oprcod from pckwrk_view group by oprcod) AS asd where summ > 500
The inner query fetches the results that need to be filtered.
The aggregate that the filter applies to must be given an alias, because the outer query can only refer to the inner query's aggregate through a column name.
Finally, the filter is applied in the outer query to the aliased column of the inner query.
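The derived-table form above should give the same rows as a plain HAVING clause. The sketch below compares the two using Python's built-in sqlite3 module; pckwrk_view is modelled as an ordinary table with invented rows, which is an assumption made only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for pckwrk_view with hypothetical data.
conn.execute("CREATE TABLE pckwrk_view (oprcod TEXT, appqty INTEGER)")
conn.executemany(
    "INSERT INTO pckwrk_view VALUES (?, ?)",
    [("PICK", 300), ("PICK", 250), ("PACK", 100), ("LOAD", 600)],
)

# Filter on the aliased aggregate from the outer query.
derived = conn.execute(
    """
    SELECT oprcod, summ
    FROM (SELECT oprcod, sum(appqty) AS summ
          FROM pckwrk_view
          GROUP BY oprcod) AS asd
    WHERE summ > 500
    ORDER BY oprcod
    """
).fetchall()

# Same filter expressed with HAVING.
having = conn.execute(
    """
    SELECT oprcod, sum(appqty) AS summ
    FROM pckwrk_view
    GROUP BY oprcod
    HAVING sum(appqty) > 500
    ORDER BY oprcod
    """
).fetchall()

print(derived)
print(derived == having)
```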

Try this one
select SUM(RecQty) RecQty,ItemCode from
CostLedger group by ItemCode
having sum(RecQty) > 2000

Related

Why does adding GROUP BY cause a seemingly unrelated error?

The following code works fine:
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
FROM items;
However, when I add GROUP BY name:
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
FROM items
GROUP BY name;
I get ERROR: subquery uses ungrouped column "items.id" from outer query
Can anyone tell me why this is happening? Thanks!
If you GROUP BY name then any other columns you select from items must have an aggregate function applied. That's what GROUP BY means.
In your case, you are using another column from items -- id -- in a correlated scalar subquery. That's not an aggregate function, and id is not in the GROUP BY clause, so you get an error.
You could instead GROUP BY name, id. That should give you the same results as the first query, and is probably pointless.
If you actually have multiple rows in items with the same value for name, and you want to group the results of the scalar subquery for those values, you need to specify how to group them. Perhaps you want the total of the subquery results for each value of name. If so, I think you could do:
SELECT name, SUM((SELECT count(item_id) FROM bids WHERE item_id = items.id))
FROM items
GROUP BY name;
(I'm not positive about the specific syntax as I don't have a Postgres instance to test against.)
A clearer way to express it might be:
SELECT name, SUM(bid_count)
FROM (
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id) AS bid_count
FROM items
) AS t
GROUP BY name
Join the tables then perform the GROUP BY:
select i.name, count(b.item_id)
from items i
inner join bids b
on b.item_id = i.id
group by i.name
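One difference between the subquery and join answers is worth checking: an inner join drops items that have no bids, while the correlated subquery reports them with a count of 0. The sketch below shows this with Python's built-in sqlite3 module and invented rows (SQLite rather than Postgres, purely for portability); a LEFT JOIN is included as one way to keep the zero rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two items named 'lamp', and a 'desk' with no bids.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE bids (item_id INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "lamp"), (2, "lamp"), (3, "chair"), (4, "desk")],
)
conn.executemany("INSERT INTO bids VALUES (?)", [(1,), (1,), (2,), (3,)])

# Correlated subquery per item, then summed per name.
subquery_rows = conn.execute(
    """
    SELECT name, SUM(bid_count)
    FROM (
        SELECT name,
               (SELECT count(item_id) FROM bids WHERE item_id = items.id) AS bid_count
        FROM items
    ) AS t
    GROUP BY name
    ORDER BY name
    """
).fetchall()

# Inner join: names without any bid disappear.
inner_rows = conn.execute(
    """
    SELECT i.name, count(b.item_id)
    FROM items i
    INNER JOIN bids b ON b.item_id = i.id
    GROUP BY i.name
    ORDER BY i.name
    """
).fetchall()

# Left join: names without bids are kept with a count of 0.
left_rows = conn.execute(
    """
    SELECT i.name, count(b.item_id)
    FROM items i
    LEFT JOIN bids b ON b.item_id = i.id
    GROUP BY i.name
    ORDER BY i.name
    """
).fetchall()

print(subquery_rows)
print(inner_rows)
print(left_rows)
```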

not a single-group group function (00937. 00000)

SELECT p.pdept.dno,
MAX(SUM(p.budget)) AS max
FROM Proj111 p
GROUP BY p.pdept.dno
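Although I have included all the necessary attributes from the SELECT list in the GROUP BY clause, the query still raises the above error. How can I solve it?
You can't nest aggregate functions like that: MAX(SUM(...)) collapses all the groups into a single row, so the grouping column p.pdept.dno can no longer appear in the select list. If you want the department with the largest total budget, you can try this: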
WITH cte AS (
SELECT p.pdept.dno AS dept,
SUM(p.budget) AS budget
FROM Proj111 p
GROUP BY p.pdept.dno
)
SELECT t.dept,
t.budget
FROM cte t
WHERE t.budget = (SELECT MAX(budget) FROM cte)
The common table expression which I have named cte finds the budgets for each department. The query then restricts this result to the department with the maximum budget by again querying the cte for the maximum budget.
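The same CTE pattern can be tried outside Oracle. The sketch below runs it with Python's built-in sqlite3 module; the object attribute p.pdept.dno is replaced by a flat dno column and the budgets are invented, both assumptions made only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Flat stand-in for the question's table; dno replaces p.pdept.dno.
conn.execute("CREATE TABLE Proj111 (dno INTEGER, budget INTEGER)")
conn.executemany(
    "INSERT INTO Proj111 VALUES (?, ?)",
    [(10, 100), (10, 200), (20, 500), (30, 50), (30, 100)],
)

# Step 1 (cte): total budget per department.
# Step 2: keep only the department(s) whose total equals the maximum total.
top_departments = conn.execute(
    """
    WITH cte AS (
        SELECT dno AS dept, SUM(budget) AS budget
        FROM Proj111
        GROUP BY dno
    )
    SELECT t.dept, t.budget
    FROM cte t
    WHERE t.budget = (SELECT MAX(budget) FROM cte)
    """
).fetchall()

print(top_departments)
```

If two departments tie for the largest total, both rows are returned.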

Oracle SQL use of subquery simultaneously in group by & select clauses in conjunction with CASE operator

Long title and strange problem:
I want to use the with-statement in oracle SQL to reuse a sub-query as well in the select as group by clause. Additionally, I use a case statement in order to create more information and group the results. This statement however throws following error: ORA-00979: not a GROUP BY expression.
Example query that is not working: I define a query containing the sum of the sales per product family. I sort this query and select the best-selling product family out of it. As the main result, I want to compare this top-selling family to the sum of the sales of the other product families (not one by one, but all other product families grouped together). I do this in the following way:
WITH
top_family AS (
SELECT *
FROM (SELECT c.family
FROM products c, sales d
WHERE c.product_id= d.product_id
GROUP BY c.family
ORDER BY SUM(d.quantity) DESC)
WHERE ROWNUM = 1)
SELECT CASE
WHEN a.family IN (SELECT * FROM top_family)
THEN 'Most sold category'
ELSE 'Other categories'
END Family, SUM(a.price*b.quantity) "Total monetary sales"
FROM products a, sales b
WHERE a.product_id = b.product_id
GROUP BY CASE
WHEN a.family IN (SELECT * FROM top_family)
THEN 'Most sold category'
ELSE 'Other categories'
END
ORDER BY 1;
An interesting fact is that if I put the sub-query 'top_family' as defined above directly into the code (that is, replace every place containing top_family with the select * from (select ...) statement), it works and gives the desired result.
The problem should probably be caused by using the sub-query defined in a with statement. Although I realize there are (better and more elegant) solutions than this one, I'd like to find out why I can't use the table alias "top_family" in the group by and select statement.
The problem is in the GROUP BY CASE WHEN statement.
This statement is only compiled in the final step of execution. This way, that sub-clause is withheld from the SELECT CASE WHEN. This null operation is returning errors.
It is also described in the SQL manual.
After reading your requirements properly, I suggest you use something like this:
WITH
top_product AS (
SELECT s1.product_id
FROM sales s1
GROUP BY s1.product_id
HAVING sum(s1.quantity)
= (SELECT total_sale
FROM (SELECT SUM(s.quantity) AS total_sale
FROM sales s
GROUP BY s.product_id
ORDER BY SUM(s.quantity) DESC)
WHERE rownum = 1))
SELECT CASE
WHEN t.product_id IS NULL THEN 'Other categories'
ELSE 'Most sold category'
END Family,
SUM(a.price*b.quantity) "Total monetary sales"
FROM products a JOIN sales b
ON a.product_id = b.product_id
LEFT JOIN top_product t ON a.product_id = t.product_id
GROUP BY CASE
WHEN t.product_id IS NULL THEN 'Other categories'
ELSE 'Most sold category'
END
ORDER BY 1;
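One detail matters for any query that labels unmatched LEFT JOIN rows this way: a simple CASE of the form CASE x WHEN NULL compares with equality, and NULL never equals NULL, so that branch is never taken. The test has to be written as a searched CASE with IS NULL. The sketch below demonstrates the difference using Python's built-in sqlite3 module; it follows standard SQL NULL semantics, but SQLite stands in for Oracle here as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simple CASE: NULL = NULL is not true, so the ELSE branch is chosen.
simple_case = conn.execute(
    """
    SELECT CASE NULL
             WHEN NULL THEN 'Other categories'
             ELSE 'Most sold category'
           END
    """
).fetchone()[0]

# Searched CASE with IS NULL: the NULL is detected as intended.
searched_case = conn.execute(
    """
    SELECT CASE
             WHEN NULL IS NULL THEN 'Other categories'
             ELSE 'Most sold category'
           END
    """
).fetchone()[0]

print(simple_case)
print(searched_case)
```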

How can I use the GROUP BY SQL clause with no aggregate function?

When I try to use the following SELECT statement:
SELECT [lots of columns]
FROM Client, Customer, Document, Group
WHERE [some conditions]
GROUP BY Group.id
SQL Server complains that the columns I selected are not part of the GROUP BY statement nor an aggregate function. Am I using GROUP BY wrong? What should I be using instead?
To return all single occurrences of a group-by field, together with the associated field values, write a query like:
select group_field,
max(other_field1),
max(other_field2),
...
from mytable1
join mytable2 on ...
group by group_field
having count(*) = 1;
Yes, you are using GROUP BY incorrectly. The point of using GROUP BY is to use aggregate functions. If you have no aggregate functions, you probably want SELECT DISTINCT instead.
SELECT DISTINCT
col1,
col2,
-- etc
coln
FROM Client
JOIN Customer ON ...
JOIN Document ON ...
JOIN [Group] ON ...
WHERE ...
My first guess would be that the problem is that you have a table called Group, which is a reserved word in SQL. Try wrapping the Group name in square brackets: [Group].
You want to group by every selected column that is not inside an aggregate function.
SELECT ProductName, ProductCategory, SUM(ProductAmount)
FROM Products
GROUP BY ProductName, ProductCategory
This will give you a distinct result of product names and categories, with the sum of the product amount over all child records in each group.
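The two suggestions above (SELECT DISTINCT when there is no aggregate, GROUP BY on every non-aggregated column when there is one) can be compared side by side. This is a minimal sketch using Python's built-in sqlite3 module with invented Products rows; SQLite is an assumption made for portability, not the SQL Server setup from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows.
conn.execute(
    "CREATE TABLE Products (ProductName TEXT, ProductCategory TEXT, ProductAmount INTEGER)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("Pen", "Office", 3), ("Pen", "Office", 2), ("Mug", "Kitchen", 5)],
)

# With an aggregate: every other selected column appears in GROUP BY.
grouped = conn.execute(
    """
    SELECT ProductName, ProductCategory, SUM(ProductAmount)
    FROM Products
    GROUP BY ProductName, ProductCategory
    ORDER BY ProductName
    """
).fetchall()

# Without an aggregate: DISTINCT returns the same rows as GROUP BY would.
distinct_rows = conn.execute(
    "SELECT DISTINCT ProductName, ProductCategory FROM Products ORDER BY ProductName"
).fetchall()
group_only = conn.execute(
    """
    SELECT ProductName, ProductCategory
    FROM Products
    GROUP BY ProductName, ProductCategory
    ORDER BY ProductName
    """
).fetchall()

print(grouped)
print(distinct_rows == group_only)
```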

Optimize sql query with the rank function

This query gets the top item in each group using the ranking function.
I want to reduce the number of inner selects down to two instead of three. I tried using the rank() function in the innermost query, but couldn't get it working along with an aggregate function. Then I couldn't use a where clause on 'itemrank' without wrapping it in yet another select statement.
Any ideas?
select *
from (
select
tmp.*,
rank() over (partition by tmp.slot order by slot, itemcount desc) as itemrank
from (
select
i.name,
i.icon,
ci.slot,
count(i.itemid) as itemcount
from items i
inner join citems ci on ci.itemid = i.itemid
group by i.name, i.icon, ci.slot
) as tmp
) as popularitems
where itemrank = 1
EDIT: using sql server 2008
In Oracle and Teradata (and perhaps others too), you can use QUALIFY itemrank = 1 to get rid of the outer select. This is not part of the ANSI standard.
You can use Common Table Expressions in Oracle or in SQL Server.
Here is the syntax:
WITH expression_name [ ( column_name [,...n] ) ]
AS
( CTE_query_definition )
The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition.
The statement to run the CTE is:
SELECT <column_list>
FROM expression_name;
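Applied to the query in the question, the CTE syntax might look like the sketch below. It flattens the nesting into named steps rather than removing a step, since the rank still has to be computed before it can be filtered. The example runs through Python's built-in sqlite3 module, which needs SQLite 3.25 or newer for window functions; the table contents are invented, and SQLite stands in for SQL Server 2008 as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for the question's two tables.
conn.execute("CREATE TABLE items (itemid INTEGER PRIMARY KEY, name TEXT, icon TEXT)")
conn.execute("CREATE TABLE citems (itemid INTEGER, slot TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "sword", "s.png"), (2, "shield", "sh.png"), (3, "helm", "h.png")],
)
conn.executemany(
    "INSERT INTO citems VALUES (?, ?)",
    [(1, "hand"), (1, "hand"), (2, "hand"), (3, "head")],
)

top_items = conn.execute(
    """
    WITH itemcounts AS (
        -- Count how often each item is used in each slot.
        SELECT i.name, i.icon, ci.slot, count(i.itemid) AS itemcount
        FROM items i
        INNER JOIN citems ci ON ci.itemid = i.itemid
        GROUP BY i.name, i.icon, ci.slot
    ),
    ranked AS (
        -- Rank the items within each slot by that count.
        SELECT itemcounts.*,
               rank() OVER (PARTITION BY slot ORDER BY itemcount DESC) AS itemrank
        FROM itemcounts
    )
    SELECT name, slot, itemcount
    FROM ranked
    WHERE itemrank = 1
    ORDER BY slot
    """
).fetchall()

print(top_items)
```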