"You tried to execute a query that does not include the specified aggregate function" - sql

SELECT SUM(orders.quantity) AS num, fName, surname
FROM author
INNER JOIN book ON author.aID = book.authorID;
I keep getting the error message: "you tried to execute a query that does not include the specified expression "fName" as part of an aggregate function. What do I do?

The error is because fName is included in the SELECT list, but is not included in a GROUP BY clause and is not part of an aggregate function (Count(), Min(), Max(), Sum(), etc.)
You can fix that problem by including fName in a GROUP BY. But then you will face the same issue with surname. So put both in the GROUP BY:
SELECT
fName,
surname,
Count(*) AS num_rows
FROM
author
INNER JOIN book
ON author.aID = book.authorID;
GROUP BY
fName,
surname
Note I used Count(*) where you wanted SUM(orders.quantity). However, orders isn't included in the FROM section of your query, so you must include it before you can Sum() one of its fields.
If you have Access available, build the query in the query designer. It can help you understand what features are possible and apply the correct Access SQL syntax.

I had a similar problem in a MS-Access query, and I solved it by changing my equivalent fName to an "Expression" (as opposed to "Group By" or "Sum"). So long as all of my fields were "Expression", the Access query builder did not require any Group By clause at the end.

GROUP BY can be selected from Total row in query design view in MS Access.
If Total row not shown in design view (as in my case). You can go to SQL View and add GROUP By fname etc. Then Total row will automatically show in design view.
You have to select as Expression in this row for calculated fields.

Related

Fields that not used in`GROUP BY` is not reachable in HAVING clauses

Fields that not used in GROUP BY are not usable in SELECT but they're usable in WHERE. This makes sense since WHEN comes before GROUP BY but shouldn't HAVING has to be able to access "other" columns of the row.
Example
Below is valid.
select fid, count(*)
from class
inner join faculty using (fid)
group by fid
having every(class.room = 'R128')
But can't do this.
select fid, count(*)
from class
inner join faculty using (fid)
group by fid
having class.room = 'R128' // Changed Line
Error message of above snippet:
RROR: column "class.room" must appear in the GROUP BY clause or be used in an aggregate function
LINE 7: having class.room = 'R128'
^
SQL state: 42803
Character: 86
I didn't fall into XY Problem, I want to know why this is impossible (Question is correct with every() later is wrong in semantics too for the question)
having is used to filter the result of the grouping.
However the room column is neither part of an aggregate nor part of the GROUP BY.
every() is an aggregate function an thus it's allowed in the having clause.
You can only use aggregate functions in a HAVING clause and “every” is an aggregate function

Query working in MySQL but trowing error in Oracle can someone please explain. and tell me how to rewrite this same query in oracle to avoid error [duplicate]

I have a query
SELECT COUNT(*) AS "CNT",
imei
FROM devices
which executes just fine. I want to further restrict the query with a WHERE statement. The (humanly) logical next step is to modify the query followingly:
SELECT COUNT(*) AS "CNT",
imei
FROM devices
WHERE CNT > 1
However, this results in a error message ORA-00904: "CNT": invalid identifier. For some reason, wrapping the query in another query produces the desired result:
SELECT *
FROM (SELECT COUNT(*) AS "CNT",
imei
FROM devices
GROUP BY imei)
WHERE CNT > 1
Why does Oracle not recognize the alias "CNT" in the second query?
Because the documentation says it won't:
Specify an alias for the column
expression. Oracle Database will use
this alias in the column heading of
the result set. The AS keyword is
optional. The alias effectively
renames the select list item for the
duration of the query. The alias can
be used in the order_by_clause but not
other clauses in the query.
However, when you have an inner select, that is like creating an inline view where the column aliases take effect, so you are able to use that in the outer level.
The simple answer is that the AS clause defines what the column will be called in the result, which is a different scope than the query itself.
In your example, using the HAVING clause would work best:
SELECT COUNT(*) AS "CNT",
imei
FROM devices
GROUP BY imei
HAVING COUNT(*) > 1
To summarize, this little gem explains:
10 Easy Steps to a Complete Understanding of SQL
A common source of confusion is the simple fact that SQL syntax
elements are not ordered in the way they are executed. The lexical
ordering is:
SELECT [ DISTINCT ]
FROM
WHERE
GROUP BY
HAVING
UNION
ORDER BY
For simplicity, not all SQL clauses are listed. This lexical ordering
differs fundamentally from the logical order, i.e. from the order of
execution:
FROM
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
UNION
ORDER BY
As a consequence, anything that you label using "AS" will only be available once the WHERE, HAVING and GROUP BY have already been performed.
I would imagine because the alias is not assigned to the result column until after the WHERE clause has been processed and the data generated. Is Oracle different from other DBMSs in this behaviour?

Why are aggregate functions not allowed in where clause

I am looking for clarification on this. I am writing two queries below:
We have a table of employee name with columns ID , name , salary
1. Select name from employee
where sum(salary) > 1000 ;
2. Select name from employee
where substring_index(name,' ',1) = 'nishant' ;
Query 1 doesn't work but Query 2 does work. From my development experience, I feel the possible explanation to this is:
The sum() works on a set of values specified in the argument. Here
'salary' column is passed , so it must add up all the values of this
column. But inside where clause, the records are checked one by one ,
like first record 1 is checked for the test and so on. Thus
sum(salary) will not be computed as it needs access to all the column
values and then only it will return a value.
Query 2 works as substring_index() works on a single value and hence here it works on the value supplied to it.
Can you please validate my understanding.
The reason you can't use SUM() in the WHERE clause is the order of evaluation of clauses.
FROM tells you where to read rows from. Right as rows are read from disk to memory, they are checked for the WHERE conditions. (Actually in many cases rows that fail the WHERE clause will not even be read from disk. "Conditions" are formally known as predicates and some predicates are used - by the query execution engine - to decide which rows are read from the base tables. These are called access predicates.) As you can see, the WHERE clause is applied to each row as it is presented to the engine.
On the other hand, aggregation is done only after all rows (that verify all the predicates) have been read.
Think about this: SUM() applies ONLY to the rows that satisfy the WHERE conditions. If you put SUM() in the WHERE clause, you are asking for circular logic. Does a new row pass the WHERE clause? How would I know? If it will pass, then I must include it in the SUM, but if not, it should not be included in the SUM. So how do I even evaluate the SUM condition?
Why can't we use aggregate function in where clause
Aggregate functions work on sets of data. A WHERE clause doesn't have access to entire set, but only to the row that it is currently working on.
You can of course use HAVING clause:
select name from employee
group by name having sum(salary) > 1000;
If you must use WHERE, you can use a subquery:
select name from (
select name, sum(salary) total_salary from employee
group by name
) t where total_salary > 1000;
sum() is an aggregation function. In general, you would expect it to work with group by. Hence, your first query is missing a group by. In a group by query, having is used for filtering after the aggregation:
Select name
from employee
group by name
having sum(salary) > 1000 ;
Using having works since the query goes direct to the rows in that column while where fails since the query keep looping back and forth whenever conditions is not met.

Select all columns on a group by throws error

I ran a query against Northwind database Products Table like below
select * from Northwind.dbo.Products GROUP BY CategoryID and i was hit with a error. I am sure you will also be hit by same error. So what is the correct statement that i need to execute to group all products with respect to their category id's.
edit: this like really helped understand a lot
http://weblogs.sqlteam.com/jeffs/archive/2007/07/20/but-why-must-that-column-be-contained-in-an-aggregate.aspx
You need to use an Aggregate function and then group by any non-aggregated columns.
I recommend reading up on GROUP BY.
If you're using GROUP BY in a query, all items in your SELECT statement must either be contained as part of an aggregate function, e.g. Sum() or Count(), else they will also need to be included in the GROUP BY clause.
Because you are using SELECT *, this is equivalent to listing ALL columns in your SELECT.
Therefore, either list them all in the GROUP BY too, use aggregating functions for the rest where possible, or only select the CategoryID.

What is the difference between HAVING and WHERE in SQL?

What is the difference between HAVING and WHERE in an SQL SELECT statement?
EDIT: I have marked Steven's answer as the correct one as it contained the key bit of information on the link:
When GROUP BY is not used, HAVING behaves like a WHERE clause
The situation I had seen the WHERE in did not have GROUP BY and is where my confusion started. Of course, until you know this you can't specify it in the question.
HAVING: is used to check conditions after the aggregation takes place.
WHERE: is used to check conditions before the aggregation takes place.
This code:
select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Gives you a table of all cities in MA and the number of addresses in each city.
This code:
select City, CNT=Count(1)
From Address
Where State = 'MA'
Group By City
Having Count(1)>5
Gives you a table of cities in MA with more than 5 addresses and the number of addresses in each city.
HAVING specifies a search condition for a
group or an aggregate function used in SELECT statement.
Source
Number one difference for me: if HAVING was removed from the SQL language then life would go on more or less as before. Certainly, a minority queries would need to be rewritten using a derived table, CTE, etc but they would arguably be easier to understand and maintain as a result. Maybe vendors' optimizer code would need to be rewritten to account for this, again an opportunity for improvement within the industry.
Now consider for a moment removing WHERE from the language. This time the majority of queries in existence would need to be rewritten without an obvious alternative construct. Coders would have to get creative e.g. inner join to a table known to contain exactly one row (e.g. DUAL in Oracle) using the ON clause to simulate the prior WHERE clause. Such constructions would be contrived; it would be obvious there was something was missing from the language and the situation would be worse as a result.
TL;DR we could lose HAVING tomorrow and things would be no worse, possibly better, but the same cannot be said of WHERE.
From the answers here, it seems that many folk don't realize that a HAVING clause may be used without a GROUP BY clause. In this case, the HAVING clause is applied to the entire table expression and requires that only constants appear in the SELECT clause. Typically the HAVING clause will involve aggregates.
This is more useful than it sounds. For example, consider this query to test whether the name column is unique for all values in T:
SELECT 1 AS result
FROM T
HAVING COUNT( DISTINCT name ) = COUNT( name );
There are only two possible results: if the HAVING clause is true then the result with be a single row containing the value 1, otherwise the result will be the empty set.
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.
Check out this w3schools link for more information
Syntax:
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value
A query such as this:
SELECT column_name, COUNT( column_name ) AS column_name_tally
FROM table_name
WHERE column_name < 3
GROUP
BY column_name
HAVING COUNT( column_name ) >= 3;
...may be rewritten using a derived table (and omitting the HAVING) like this:
SELECT column_name, column_name_tally
FROM (
SELECT column_name, COUNT(column_name) AS column_name_tally
FROM table_name
WHERE column_name < 3
GROUP
BY column_name
) pointless_range_variable_required_here
WHERE column_name_tally >= 3;
The difference between the two is in the relationship to the GROUP BY clause:
WHERE comes before GROUP BY; SQL evaluates the WHERE clause before it groups records.
HAVING comes after GROUP BY; SQL evaluates HAVING after it groups records.
References
SQLite SELECT Statement Syntax/Railroad Diagram
Informix SELECT Statement Syntax/Railroad Diagram
HAVING is used when you are using an aggregate such as GROUP BY.
SELECT edc_country, COUNT(*)
FROM Ed_Centers
GROUP BY edc_country
HAVING COUNT(*) > 1
ORDER BY edc_country;
WHERE is applied as a limitation on the set returned by SQL; it uses SQL's built-in set oeprations and indexes and therefore is the fastest way to filter result sets. Always use WHERE whenever possible.
HAVING is necessary for some aggregate filters. It filters the query AFTER sql has retrieved, assembled, and sorted the results. Therefore, it is much slower than WHERE and should be avoided except in those situations that require it.
SQL Server will let you get away with using HAVING even when WHERE would be much faster. Don't do it.
WHERE clause does not work for aggregate functions
means : you should not use like this
bonus : table name
SELECT name
FROM bonus
GROUP BY name
WHERE sum(salary) > 200
HERE Instead of using WHERE clause you have to use HAVING..
without using GROUP BY clause, HAVING clause just works as WHERE clause
SELECT name
FROM bonus
GROUP BY name
HAVING sum(salary) > 200
Difference b/w WHERE and HAVING clause:
The main difference between WHERE and HAVING clause is, WHERE is used for row operations and HAVING is used for column operations.
Why we need HAVING clause?
As we know, aggregate functions can only be performed on columns, so we can not use aggregate functions in WHERE clause. Therefore, we use aggregate functions in HAVING clause.
One way to think of it is that the having clause is an additional filter to the where clause.
A WHERE clause is used filters records from a result. The filter occurs before any groupings are made. A HAVING clause is used to filter values from a group
In an Aggregate query, (Any query Where an aggregate function is used) Predicates in a where clause are evaluated before the aggregated intermediate result set is generated,
Predicates in a Having clause are applied to the aggregate result set AFTER it has been generated. That's why predicate conditions on aggregate values must be placed in Having clause, not in the Where clause, and why you can use aliases defined in the Select clause in a Having Clause, but not in a Where Clause.
I had a problem and found out another difference between WHERE and HAVING. It does not act the same way on indexed columns.
WHERE my_indexed_row = 123 will show rows and automatically perform a "ORDER ASC" on other indexed rows.
HAVING my_indexed_row = 123 shows everything from the oldest "inserted" row to the newest one, no ordering.
When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.
However, when GROUP BY is used:
The WHERE clause is used to filter records from a result. The
filtering occurs before any groupings are made.
The HAVING clause is used to filter values from a group (i.e., to
check conditions after aggregation into groups has been performed).
Resource from Here
From here.
the SQL standard requires that HAVING
must reference only columns in the
GROUP BY clause or columns used in
aggregate functions
as opposed to the WHERE clause which is applied to database rows
While working on a project, this was also my question. As stated above, the HAVING checks the condition on the query result already found. But WHERE is for checking condition while query runs.
Let me give an example to illustrate this. Suppose you have a database table like this.
usertable{ int userid, date datefield, int dailyincome }
Suppose, the following rows are in table:
1, 2011-05-20, 100
1, 2011-05-21, 50
1, 2011-05-30, 10
2, 2011-05-30, 10
2, 2011-05-20, 20
Now, we want to get the userids and sum(dailyincome) whose sum(dailyincome)>100
If we write:
SELECT userid, sum(dailyincome) FROM usertable WHERE
sum(dailyincome)>100 GROUP BY userid
This will be an error. The correct query would be:
SELECT userid, sum(dailyincome) FROM usertable GROUP BY userid HAVING
sum(dailyincome)>100
WHERE clause is used for comparing values in the base table, whereas the HAVING clause can be used for filtering the results of aggregate functions in the result set of the query
Click here!
When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.
However, when GROUP BY is used:
The WHERE clause is used to filter records from a result. The
filtering occurs before any groupings are made.
The HAVING clause is
used to filter values from a group (i.e., to check conditions after
aggregation into groups has been performed).
I use HAVING for constraining a query based on the results of an aggregate function. E.G. select * in blahblahblah group by SOMETHING having count(SOMETHING)>0