AVG , group by, WHERE AVG greater (>) issue - sql

this is my database
CREATE TABLE korisnici(
name VARCHAR(30) NOT NULL,
amount DECIMAL(65,2)
);
INSERT INTO korisnici VALUES
("Marina",20.10),
("Petar",300.50),
("Ivana",100.70),
("Tomislav",50.20),
("Ivana",80.60),
("Petar",10.40),
("Marina",80.50),
("Ivana",70.50),
("Marina",130.20),
("Robert",60.20),
("Blanka",130.20),
("Blanka",220.40),
("Tomislav",150.20);
I would like to fetch all names from the list whose average amount is greater than 150. This is what I tried:
SELECT name, AVG(amount) AS avg FROM `korisnici` WHERE avg > 150 GROUP BY name
However, my query fails with the error "Unknown column 'avg' in 'where clause'". Can someone give me a hint?

In standard SQL you can't reference a column alias in a WHERE, JOIN, or HAVING clause (MySQL accepts aliases in HAVING as an extension, but not in WHERE), so you need to repeat the expression. That's not the only problem, though. When filtering on the result of an aggregation, the HAVING clause should be used instead of WHERE:
SELECT name, AVG(amount) AS avg
FROM `korisnici`
GROUP BY name
HAVING AVG(amount) > 150
The reason is that the WHERE clause is applied before the grouping and aggregation (and is used to determine which records get grouped and aggregated), while HAVING is applied after the aggregation.
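The difference in evaluation order can be checked with a small sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), loaded with the data from the question. The WHERE query filters individual rows before grouping, while the HAVING query filters the per-name averages:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE korisnici (name TEXT NOT NULL, amount REAL)")
conn.executemany(
    "INSERT INTO korisnici VALUES (?, ?)",
    [
        ("Marina", 20.10), ("Petar", 300.50), ("Ivana", 100.70),
        ("Tomislav", 50.20), ("Ivana", 80.60), ("Petar", 10.40),
        ("Marina", 80.50), ("Ivana", 70.50), ("Marina", 130.20),
        ("Robert", 60.20), ("Blanka", 130.20), ("Blanka", 220.40),
        ("Tomislav", 150.20),
    ],
)

# WHERE runs first: it keeps only single rows whose amount exceeds 150,
# and the grouping then sees just those rows.
where_names = [row[0] for row in conn.execute(
    "SELECT name, AVG(amount) FROM korisnici "
    "WHERE amount > 150 GROUP BY name ORDER BY name"
)]

# HAVING runs after aggregation: it keeps only groups whose average exceeds 150.
having_names = [row[0] for row in conn.execute(
    "SELECT name, AVG(amount) FROM korisnici "
    "GROUP BY name HAVING AVG(amount) > 150 ORDER BY name"
)]

print(where_names)   # names with at least one single amount over 150
print(having_names)  # names whose average amount is over 150
```

Tomislav appears only in the WHERE result: one of his amounts is above 150, but his average is not.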

You can't write it like that; it is a common SQL error.
avg is a column alias, and you can't use an alias in the WHERE clause. An aggregate such as AVG() can't appear in WHERE either, so the filter has to go in a HAVING clause after GROUP BY:
SELECT name, AVG(amount) AS avg
FROM `korisnici`
GROUP BY name
HAVING AVG(amount) > 150;
There you go.

Related

Use of HAVING without GROUP BY not working as expected

I am starting to learn SQL Server. The documentation on MSDN states:
HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, there is an implicit single, aggregated group.
This made me think that we can use HAVING without a GROUP BY clause, but when I try to write such a query I am not able to use it.
I have a table like this
CREATE TABLE [dbo].[_abc]
(
[wage] [int] NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[_abc] (wage)
VALUES (4), (8), (15), (30), (50)
GO
Now when I run this query, I get an error
select *
from [dbo].[_abc]
having sum(wage) > 5
Error:
The documentation is correct; i.e. you could run this statement:
select sum(wage) sum_of_all_wages
, count(1) count_of_all_records
from [dbo].[_abc]
having sum(wage) > 5
The reason your statement doesn't work is the select *, which means select every column's value. When there is no group by, all records are aggregated; i.e. you only get 1 record in your result set, which has to represent every record. As such, you can only* include values provided by applying aggregate functions to your columns, not the columns themselves.
* of course, you can also provide constants, so select 'x' constant, count(1) cnt from myTable would work.
There aren't many use cases I can think of where you'd want to use having without a group by, but certainly it can be done as shown above.
NB: If you wanted all rows where the wage was greater than 5, you'd use the where clause instead:
select *
from [dbo].[_abc]
where wage > 5
Equally, if you want the sum of all wages greater than 5 you can do this
select sum(wage) sum_of_wage_over_5
from [dbo].[_abc]
where wage > 5
Or if you wanted to compare the sum of wages over 5 with those under:
select case when wage > 5 then 1 else 0 end wage_over_five
, sum(wage) sum_of_wage
from [dbo].[_abc]
group by case when wage > 5 then 1 else 0 end
Update based on comments:
Do you need having to use aggregate functions?
No. You can run select sum(wage) from [dbo].[_abc]. When an aggregate function is used without a group by clause, it's as if you're grouping by a constant; i.e. select sum(wage) from [dbo].[_abc] group by 1.
The documentation merely means that whilst normally you'd have a having statement with a group by statement, it's OK to exclude the group by; in such cases the having statement, like the select statement, will treat your query as if you'd specified group by 1.
What's the point?
It's hard to think of many good use cases, since you're only getting one row back and the having statement is a filter on that.
One use case could be that you write code to monitor your licenses for some software: if you have fewer users than per-user licenses, all's good and you don't want to see a result since you don't care. If you have more users, you want to know about it. E.g.
declare @totalUserLicenses int = 100
select count(1) NumberOfActiveUsers
, @totalUserLicenses NumberOfLicenses
, count(1) - @totalUserLicenses NumberOfAdditionalLicensesToPurchase
from [dbo].[Users]
where enabled = 1
having count(1) > @totalUserLicenses
Isn't the select irrelevant to the having clause?
Yes and no. Having is a filter on your aggregated data. Select says what columns/information to bring back. As such you have to ask "what would the result look like?" i.e. Given we've had to effectively apply group by 1 to make use of the having statement, how should SQL interpret select *? Since your table only has one column this would translate to select wage; but we have 5 rows, so 5 different values of wage, and only 1 row in the result to show this.
I guess you could say "I want to return all rows if their sum is greater than 5; otherwise I don't want to return any rows". Were that your requirement it could be achieved a variety of ways; one of which would be:
select *
from [dbo].[_abc]
where exists
(
select 1
from [dbo].[_abc]
having sum(wage) > 5
)
However, we have to write the code to meet the requirement, rather than expect the code to understand our intent.
Another way to think about having is as being a where statement applied to a subquery. I.e. your original statement effectively reads:
select wage
from
(
select sum(wage) sum_of_wage
from [dbo].[_abc]
group by 1
) singleRowResult
where sum_of_wage > 5
That won't run because wage is not available to the outer query; only sum_of_wage is returned.
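The "where applied to a subquery" view of having can be tried out with a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption for portability), and the helper name sum_if_over is hypothetical. The outer query selects sum_of_wage, the only column the inner query exposes, so the result is either one row or none. The last query illustrates the implicit single group: an aggregate without group by still yields one row when no rows match.

```python
import sqlite3

# In-memory SQLite table mirroring [dbo].[_abc] from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE _abc (wage INTEGER)")
conn.executemany("INSERT INTO _abc (wage) VALUES (?)",
                 [(4,), (8,), (15,), (30,), (50,)])


def sum_if_over(threshold):
    """Return the single aggregated row only if the total exceeds threshold."""
    return conn.execute(
        "SELECT sum_of_wage "
        "FROM (SELECT SUM(wage) AS sum_of_wage FROM _abc) AS single_row "
        "WHERE sum_of_wage > ?",
        (threshold,),
    ).fetchall()


kept = sum_if_over(5)        # the total is 107, so the one row survives
dropped = sum_if_over(1000)  # the one row is filtered out, leaving no rows

# Aggregating without GROUP BY yields exactly one row even when the
# WHERE clause matches nothing.
empty_group = conn.execute(
    "SELECT COUNT(*), SUM(wage) FROM _abc WHERE wage > 1000"
).fetchall()

print(kept, dropped, empty_group)
```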
HAVING without GROUP BY clause is perfectly valid but here is what you need to understand:
The result will contain zero or one row
The implicit GROUP BY will return exactly one row even if the WHERE condition matched zero rows
HAVING will keep or eliminate that single row based on the condition
Any column in the SELECT clause needs to be wrapped inside an aggregate function
You can also specify an expression as long as it is not functionally dependent on the columns
Which means you can do this:
SELECT SUM(wage)
FROM employees
HAVING SUM(wage) > 100
-- One row containing the sum if the sum is greater than 100
-- Zero rows otherwise
Or even this:
SELECT 1
FROM employees
HAVING SUM(wage) > 100
-- One row containing "1" if the sum is greater than 100
-- Zero rows otherwise
This construct is often used when you're interested in checking if a match for the aggregate was found:
SELECT *
FROM departments
WHERE EXISTS (
SELECT 1
FROM employees
WHERE employees.department = departments.department
HAVING SUM(wage) > 100
)
-- all departments whose employees earn more than 100 in total
In SQL you cannot select non-aggregated columns alongside aggregate functions directly; you need to group by the non-aggregated columns,
as in the example below:
USE AdventureWorks2012 ;
GO
SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING SUM(LineTotal) > 100000.00
ORDER BY SalesOrderID ;
In your case the table has no identity column; you could add one as below:
ALTER TABLE [dbo].[_abc]
ADD Id_new INT IDENTITY(1, 1)
GO

how to find maximum of sum of number using if else in procedure in sap hana sql

I want to list the product that has the highest sales amount for each date.
Note: highest sales amount here means max(sum(sales_amnt)...
I want to do this using IF or CASE in a procedure in SAP HANA SQL.
I did this using a WITH clause:
-- CORRECT ONE
WITH ranked AS
(
SELECT Dense_RAnk() OVER (ORDER BY SUM("SALES_AMNT"), "SALES_DATE", "PROD_NAME") as rank,
SUM("SALES_AMNT") AS Amount, "PROD_NAME",count(*), "SALES_DATE" FROM "KABIL"."DATE"
GROUP BY "SALES_DATE", "PROD_NAME"
)
SELECT "SALES_DATE", "PROD_NAME",Amount
FROM ranked
WHERE rank IN ( select MAX(rank) from ranked group by "SALES_DATE")
ORDER BY "SALES_DATE" DESC;
this is my table
You cannot use IF within a SELECT statement. Note that you can achieve most boolean logic with the CASE expression syntax.
In a SELECT, the expression is applied to a column and your logic is executed once per row of the result set. Hence, writing imperative logic there is discouraged. Still, if you want to do the same, create a calculation view and use intermediate calculated columns to achieve what you are expecting.
Try this; it gave me the answer:
select "SALES_DATE", "PROD_NAME", sum("SALES_AMNT")
from "KABIL"."DATE"
group by "SALES_DATE", "PROD_NAME"
having (sum("SALES_AMNT"), "SALES_DATE") in (
select max(SUM_SALES), "SALES_DATE"
from (
select sum("SALES_AMNT") as SUM_SALES, "SALES_DATE", "PROD_NAME"
from "KABIL"."DATE"
group by "SALES_DATE", "PROD_NAME"
)
group by "SALES_DATE"
);
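As a rough cross-check of the "max of sums per date" idea, here is a sketch that swaps in a different technique: a CTE joined to its own per-date maximum, rather than the row-value IN used above. It runs on SQLite via Python's sqlite3 module, not SAP HANA, and the table name sales and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical sales data; the real table in the question is "KABIL"."DATE".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sales_date TEXT, prod_name TEXT, sales_amnt INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-01", "pen", 10),
        ("2024-01-01", "pen", 15),
        ("2024-01-01", "ink", 20),
        ("2024-01-02", "ink", 5),
        ("2024-01-02", "pad", 7),
    ],
)

# Step 1 (totals): sum the sales per date and product.
# Step 2 (m): take the largest of those sums per date.
# Step 3: keep the products whose total equals that per-date maximum.
top_per_date = conn.execute(
    """
    WITH totals AS (
        SELECT sales_date, prod_name, SUM(sales_amnt) AS amount
        FROM sales
        GROUP BY sales_date, prod_name
    )
    SELECT t.sales_date, t.prod_name, t.amount
    FROM totals AS t
    JOIN (
        SELECT sales_date, MAX(amount) AS max_amount
        FROM totals
        GROUP BY sales_date
    ) AS m
      ON m.sales_date = t.sales_date AND m.max_amount = t.amount
    ORDER BY t.sales_date
    """
).fetchall()

print(top_per_date)
```

If two products tie for the highest total on a date, both rows are returned.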

SQL Server Query is invalid

I'm having the following error with this query in SQL server 2014 "Operand data type varchar is invalid for sum operator."
SUM (DISTINCT (studentsip.AdminNO)) AS NoOfStudentsAllocated
If you want to count the number of students (as suggested by the column name), then use COUNT(), not SUM():
COUNT(DISTINCT studentsip.AdminNO) AS NoOfStudentsAllocated
I have a certain amount of experience with SQL. I have never used SUM(DISTINCT). I wish the language did not allow the syntax.
I should note that if the DISTINCT is not needed, then you should not use it. DISTINCT almost always slows down queries.
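To make the COUNT(DISTINCT ...) suggestion concrete, here is a small sketch on SQLite via Python's sqlite3 module (a stand-in for SQL Server). The AdminNO values are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical rows: three distinct students, two of them listed twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE studentsip (AdminNO TEXT)")
conn.executemany(
    "INSERT INTO studentsip VALUES (?)",
    [("A001",), ("A001",), ("A002",), ("A003",), ("A003",)],
)

# COUNT(*) counts every row; COUNT(DISTINCT col) counts each value once.
total_rows, distinct_students = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT AdminNO) FROM studentsip"
).fetchone()

print(total_rows, distinct_students)
```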
Your field is of type varchar. To use it in sum() you need to convert() it to int:
sum(distinct(convert(int,studentsip.AdminNO))) as NoOfStudentsAllocated
If you want the number of students that have each value of AdminNo, you can use count and group by:
select AdminNO, count(1) as NoOfStudentsAllocated
from studentsip
group by AdminNO
order by AdminNO

sql divide column by column max

I have a column of count and want to divide the column by max of this column to get the rate.
I tried
select t.count/max(t.count)
from table t
group by t.count
but failed.
I also tried the one without GROUP BY, still failed.
Ordering by count descending and picking the first value as the divisor didn't work in my case. Consider that I have different counts per product subcategory. For each product category, I want to divide the count of each subcategory by the max count in that category. I can't think of a way to avoid an aggregate function.
If you want the MAX() per category you need a correlated subquery:
select t.count*1.0/(SELECT max(a.count)
FROM table a
WHERE t.category = a.category)
from table t
Or you need to PARTITION BY your MAX():
select t.count*1.0/(max(t.count) over (PARTITION BY t.category))
from table t
The following should work in most dialects of SQL:
select t.count/(select max(t2.count) from table t2)
from table t
group by t.count;
Note that some versions of SQL do integer division, so the result will be either 0 or 1. You can fix this by multiplying by 1.0 or casting to a float.
Most versions of SQL also support:
select t.count/(max(t.count) over ())
from table t
group by t.count;
The same caveat applies about integer division.
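The integer-division caveat is easy to see in a sketch. This one uses SQLite via Python's sqlite3 module, which does truncate integer division; the table subcategory_counts, its columns, and its rows are hypothetical. It computes each subcategory's rate against the max in its category with a correlated subquery, once as written and once multiplied by 1.0.

```python
import sqlite3

# Hypothetical per-subcategory counts; cnt is an INTEGER column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subcategory_counts (category TEXT, subcategory TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO subcategory_counts VALUES (?, ?, ?)",
    [("A", "a1", 2), ("A", "a2", 4), ("B", "b1", 5), ("B", "b2", 10)],
)

QUERY = """
SELECT t.subcategory,
       {numerator} / (SELECT MAX(a.cnt)
                      FROM subcategory_counts AS a
                      WHERE a.category = t.category) AS rate
FROM subcategory_counts AS t
ORDER BY t.category, t.subcategory
"""

# Integer / integer truncates, so every rate collapses to 0 or 1.
int_rates = conn.execute(QUERY.format(numerator="t.cnt")).fetchall()

# Multiplying by 1.0 first forces floating-point division.
float_rates = conn.execute(QUERY.format(numerator="t.cnt * 1.0")).fetchall()

print(int_rates)
print(float_rates)
```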
You might want to try using a subquery to derive the max value (including both in the same query might not work the way that you are expecting, since you are grouping on the same column that you are aggregating)
Select t.count / (select max(sub.count) from table sub)
from table t
group by t.count

Average when meeting where clause

I want to find the average amount of a field where it meets a criterion. It is embedded in a big table but I would like this average field in there instead of doing it in a separate table.
This is what I have so far:
Select....
Avg( (currbal) where (select * from table
where ament2 in ('r1','r2'))
From table
If you want to AVG only a subset of the rows in a query, use CASE WHEN ... THEN to replace the value in non-matching rows with NULL, since NULLs are ignored by avg().
Select id,
sum(something) SomethingSummed,
avg(case when ament2 in ('r1','r2') then currbal end) CurrbalAveragedForR1R2
From [table]
group by id
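A quick sketch of the conditional average, run on SQLite via Python's sqlite3 module as a stand-in for the asker's database. The table name accounts and its rows are hypothetical; currbal and ament2 are the column names from the question. Rows that fail the condition contribute NULL, which AVG skips, and a group with no matching rows averages to NULL.

```python
import sqlite3

# Hypothetical data: id 1 has two matching rows and one non-matching row,
# id 2 has no matching rows at all.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, ament2 TEXT, currbal REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "r1", 100.0), (1, "r2", 200.0), (1, "x", 1000.0), (2, "x", 50.0)],
)

# CASE without ELSE returns NULL for non-matching rows; AVG ignores NULLs,
# so only the 'r1'/'r2' rows enter the average.
rows = conn.execute(
    """
    SELECT id,
           AVG(CASE WHEN ament2 IN ('r1', 'r2') THEN currbal END) AS avg_r1_r2
    FROM accounts
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(rows)
```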
You can put the filter, together with any other sums you want alongside the AVG, inside a derived table in the FROM clause. Something like:
SELECT AVG(currbal)
FROM
(
SELECT * -- other sums
FROM table
WHERE ament2 IN ('r1','r2')
) t
You can write a full sub-select into the select list:
SELECT ...,
(SELECT AVG(Currbal) FROM Table WHERE ament2 IN ('r1', 'r2')) AS avg_currbal,
...
FROM ...
Whether that will do exactly what you want depends on a number of things. You might need to make that into a correlated subquery; assuming 'ament2' is in Table, it is not a correlated sub-query at the moment.