Sql error in simple query? Any ideas? - sql

I'm getting the error:
Column 'A10000012VICKERS.dbo.IMAGES.idimage' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any ideas why I would be getting this error or how to fix it? I thought that I was just asking for the size of a number of filestream columns and the values of two others?
SELECT
idimage,
filetype,
SUM(DATALENGTH(filestreamimageoriginal)) AS original,
SUM(DATALENGTH(filestreamimagefull)) AS [full],
SUM(DATALENGTH(filestreamimageextra)) AS extra,
SUM(DATALENGTH(filestreamimagelarge)) AS large,
SUM(DATALENGTH(filestreamimagemedium)) AS medium,
SUM(DATALENGTH(filestreamimagesmall)) AS small,
SUM(DATALENGTH(filestreamimagethumbnail)) AS thumbnail
FROM A10000012VICKERS.dbo.IMAGES
WHERE display = 1

The column named in the message is idimage from your SELECT list; SQL Server just reports it by its fully qualified name (database.schema.table.column). The query does have an obvious error.
Your query is an aggregation query because it uses SUM() in the SELECT clause. Without a GROUP BY, an aggregation query returns only one row, so there is no single idimage or filetype value to put in that row.
Add this to the end of your query:
GROUP BY idimage, filetype
Or, remove these columns from the SELECT.

By using an aggregation function (SUM) you are aggregating your records. As you have specified no GROUP BY clause you will get one result row, i.e. an aggregation over all rows. In this aggregation, however, there is no longer one idimage or one filetype that you could show in your results.
So either use an aggregation function on these, too (e.g. max(idimage), min(filetype)) or remove them from the query, if you really want one aggregate over all these rows.
If, however, you want to aggregate per idimage and filetype, then add GROUP BY idimage, filetype at the end of your query.
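To make the two fixes concrete, here is a minimal sketch that runs both corrected forms against an in-memory SQLite database from Python. The table layout and sample rows are made up for illustration, only two of the size columns are included, and SQLite's LENGTH stands in for SQL Server's DATALENGTH. Note that SQLite would not reject the original query (it tolerates bare columns in an aggregate query), so the sketch shows only the corrected queries, not the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the IMAGES table (only two blob columns).
conn.execute(
    "CREATE TABLE images (idimage INTEGER, filetype TEXT, display INTEGER, "
    "original BLOB, thumbnail BLOB)"
)
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?, ?)",
    [
        (1, "jpg", 1, b"x" * 100, b"x" * 10),
        (2, "jpg", 1, b"x" * 200, b"x" * 20),
        (3, "png", 1, b"x" * 300, b"x" * 30),
        (4, "png", 0, b"x" * 999, b"x" * 99),  # excluded by display = 1
    ],
)

# Fix 1: drop the non-aggregated columns and get one total row.
total = conn.execute(
    "SELECT SUM(LENGTH(original)), SUM(LENGTH(thumbnail)) "
    "FROM images WHERE display = 1"
).fetchone()

# Fix 2: keep the columns and group by them, one row per (idimage, filetype).
per_image = conn.execute(
    "SELECT idimage, filetype, SUM(LENGTH(original)), SUM(LENGTH(thumbnail)) "
    "FROM images WHERE display = 1 "
    "GROUP BY idimage, filetype ORDER BY idimage"
).fetchall()

print(total)
print(per_image)
```

If idimage is unique per row, the grouped form returns each image's own sizes, so the SUM is only doing real work in the single-total form.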

Related

exclude a column from group by statement

I would like to exclude a column from group by statement, because it results in some redundant records. Are there any recommendations?
I use Oracle and have a complex query that joins 6 tables together. I want to use a SQL aggregate function (COUNT) without duplicate results.
You can't.
When using aggregate functions every column/column expression which is not an aggregate must be in the GROUP BY.
This is completely logical. If you're not aggregating the column, then excluding it from the GROUP BY would force Oracle to choose an arbitrary value, which is not very useful.
If you don't want this column in your GROUP BY then you must decide what aggregation to apply to this column in order to return the appropriate data for your situation. You can't hand this responsibility off to the database engine.
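A small sketch of what "decide what aggregation to apply" can look like, run here on SQLite from Python rather than Oracle. The orders table and its columns are hypothetical. Grouping by the extra column produces the "redundant" rows; leaving it out of the GROUP BY means replacing it with an explicit aggregate such as MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ann", "open", "2020-01-05"),
        ("ann", "closed", "2020-02-01"),
        ("bob", "open", "2020-01-20"),
    ],
)

# status in the GROUP BY: one row per (customer, status), so ann appears twice.
with_status = conn.execute(
    "SELECT customer, status, COUNT(*) FROM orders "
    "GROUP BY customer, status ORDER BY customer, status"
).fetchall()

# status dropped, order_date aggregated explicitly: one row per customer.
collapsed = conn.execute(
    "SELECT customer, COUNT(*), MAX(order_date) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

print(with_status)
print(collapsed)
```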

SQL - Using MAX in a WHERE clause

Assume value is an int and the following query is valid:
SELECT blah
FROM table
WHERE attribute = value
Though MAX(expression) returns int, the following is not valid:
SELECT blah
FROM table
WHERE attribute = MAX(expression)
Of course the desired effect can be achieved using a subquery, but my question is why SQL was designed this way. Is there some reason why this sort of thing is not allowed? Students coming from programming languages where you can always replace a value of some data type with a function call that returns that type find this confusing. Is there an explanation one can give them rather than just saying "that's the way it is"?
It's just because of the order of operations of a query.
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
WHERE just filters the rows returned by FROM. An aggregate function like MAX() can't have a result returned because it hasn't even been applied to anything.
That's also the reason why you can't use aliases defined in the SELECT clause in a WHERE clause, but you can use aliases defined in the FROM clause.
A where clause checks every row to see if it matches the conditions specified.
A MAX computes a single value from a row set. If you put MAX, or any other aggregate function, into a WHERE clause, how can SQL Server figure out which rows the MAX function can use before the WHERE clause has finished filtering?
This deals with the order that SQL Server processes commands in. It runs the WHERE clause before a GROUP BY or any aggregate. Since a where clause runs first, SQL Server can't tell if a row will be included in an aggregate until it processes the where. That is what the HAVING clause is for. HAVING runs after the GROUP BY and the WHERE and can include MAX since you have already filtered out the rows you don't want to use. See http://www.bennadel.com/blog/70-SQL-Query-Order-of-Operations.htm for a good explanation of the order in which SQL commands run.
Maybe this will work:
SELECT blah
FROM table
WHERE attribute = (SELECT MAX(expression) FROM table1)
The WHERE clause is specifically designed to test conditions against raw data (individual rows of the table). However, MAX is an aggregate function over multiple rows of data. Basically, without a sub-select, the WHERE clause knows nothing about any rows in the table except for the current row. So how can you determine the maximum value over a whole bunch of rows when you don't even know what those rows are?
Yes, it's a little bit of a simplification, especially when dealing with joins, but the same principle applies. WHERE is always row-by-row, so that's all it really knows about.
Even if you have a GROUP BY clause, the WHERE clause still only processes one row at a time in the raw data before grouping. It doesn't know the value of a column in any other rows, so it has no way of knowing which row has the maximum value.
Assuming this is MS SQL Server, the following would work.
SELECT TOP 1 blah
FROM table
ORDER BY expression DESC
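The answers above can be tried out with a short sketch. It uses SQLite from Python instead of SQL Server, and the scores table is invented for the example. It shows the aggregate being rejected inside WHERE, then the two working alternatives: a subquery in WHERE, and HAVING after a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE scores (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ann", 10), ("bob", 30), ("cy", 30)],
)

# An aggregate directly in WHERE is rejected when the statement is prepared.
try:
    conn.execute("SELECT name FROM scores WHERE points = MAX(points)").fetchall()
    where_max_error = None
except sqlite3.OperationalError as exc:
    where_max_error = str(exc)

# The subquery computes MAX over its own row set first; WHERE then
# compares each row against that single value.
top = conn.execute(
    "SELECT name FROM scores "
    "WHERE points = (SELECT MAX(points) FROM scores) ORDER BY name"
).fetchall()

# HAVING runs after grouping, so aggregates are allowed there.
high_groups = conn.execute(
    "SELECT name, MAX(points) FROM scores "
    "GROUP BY name HAVING MAX(points) >= 30 ORDER BY name"
).fetchall()

print(where_max_error)
print(top)
print(high_groups)
```

Unlike the TOP 1 ... ORDER BY approach, the subquery form returns every row that ties for the maximum.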

SQL column reference is invalid

I am using Jaspersoft's iReport to create a report that will pull data from my Maintenance Assistant CMMS database. The DB is on the localhost, and I am not creating any tables or columns. MA CMMS takes care of that. I only want to pull the data to arrange in a report.
Here is my code:
SELECT *
FROM "tblworkordertask"
WHERE "dbltimespenthours" > 0
AND "dtmdatecompleted" BETWEEN $P{DATE_FROM} AND $P{DATE_TO}
GROUP BY "intworkorderid"
and my error:
Caused by: java.sql.SQLSyntaxErrorException: Column reference 'tblWorkOrderTask.id' is invalid, or is part of an invalid expression.  For a SELECT list with a GROUP BY, the columns and expressions being selected may only contain valid grouping expressions and valid aggregate expressions.
I don't know why the error is referring to 'tblWorkOrderTask.id' because I don't have such a column, nor did I ask for that column.
If I take out the group by clause, it works fine, but as you could expect, I get multiple results with the same WorkOrderID. I want to group it by this column, and then count the results. I tried using SELECT DISTINCT, but then I get errors about columns that aren't selected.
You're selecting all columns in the tblWorkOrderTask table with SELECT *. The "id" column is the first column in that table, so it is the first one the error reports. You are getting the error because the columns you select are not all listed in the GROUP BY clause.
This select would work, but I'm not sure what information you need out of your table.
SELECT id, intworkorderid
FROM tblWorkOrderTask
group by id, intworkorderid
http://www.w3schools.com/sql/sql_groupby.asp
Get rid of the GROUP BY clause -- if you're just trying to order the result, then use ORDER BY instead; but otherwise, you don't need either.
EDIT
As the error says, everything in your SELECT list must be one of two things: either 1) also listed in your GROUP BY list, or 2) an aggregated value. Here is a sample that will work:
SELECT intworkorderid, COUNT(*)
FROM "tblworkordertask"
WHERE "dbltimespenthours" > 0
AND "dtmdatecompleted" BETWEEN $P{DATE_FROM} AND $P{DATE_TO}
GROUP BY "intworkorderid"
Yes - in order to use group by, you need to be specific in the select line.
So first, decide which fields you want to display. If you want them all, then include them all.
As soon as you combine a COUNT() with other, non-aggregated fields in the SELECT, you will need to add the GROUP BY clause. COUNT() is an AGGREGATE function, like SUM() and AVG().
It's a little counter-intuitive and a bit of a pain to specify so many fields in the GROUP BY clause, but it's necessary.
The FIRST GROUP BY field is the most important, since this is usually what you are concerned about.
This first field can be any of the SELECTed fields, it is not necessarily the first.
Include EVERY field in your GROUP BY that is not an AGGREGATE function like COUNT().
Also, if you are trying to COUNT a group of orders, you probably don't want or need all of the fields in the SELECT.
You probably want to specify just the fields that are unique to the work order ID.
Example: If you want a COUNT per work order, you would list all of the SELECTED fields EXCEPT the COUNT() in the GROUP BY.
SELECT
intWorkOrderID,
COUNT(id),
strDescription
FROM tblworkordertask
WHERE dbltimespenthours > 0
AND dtmdatecompleted BETWEEN $P{DATE_FROM} AND $P{DATE_TO}
GROUP BY
intworkorderid,
strDescription
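Here is a runnable sketch of the "group by work order, then count" idea, using SQLite from Python in place of the report's actual database. The sample rows are invented, only the columns named in the question are modelled, and the `?` placeholders stand in for the report parameters $P{DATE_FROM} and $P{DATE_TO}.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of tblworkordertask.
conn.execute(
    "CREATE TABLE tblworkordertask (id INTEGER PRIMARY KEY, "
    "intworkorderid INTEGER, dbltimespenthours REAL, dtmdatecompleted TEXT)"
)
conn.executemany(
    "INSERT INTO tblworkordertask VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1.5, "2020-01-10"),
        (2, 100, 2.0, "2020-01-11"),
        (3, 200, 0.5, "2020-01-12"),
        (4, 200, 0.0, "2020-01-13"),  # excluded: no time spent
        (5, 300, 1.0, "2020-03-01"),  # excluded: outside the date range
    ],
)

# Every selected column is either in the GROUP BY or inside an aggregate.
per_order = conn.execute(
    "SELECT intworkorderid, COUNT(*), SUM(dbltimespenthours) "
    "FROM tblworkordertask "
    "WHERE dbltimespenthours > 0 AND dtmdatecompleted BETWEEN ? AND ? "
    "GROUP BY intworkorderid ORDER BY intworkorderid",
    ("2020-01-01", "2020-01-31"),
).fetchall()

print(per_order)
```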

Why can't I perform an aggregate function on an expression containing an aggregate but I can do so by creating a new select statement around it?

Why is it that in SQL Server I can't do this:
select sum(count(id)) as 'count'
from table
But I can do
select sum(x.count)
from
(
select count(id) as 'count'
from table
) x
Are they not essentially the same thing? How am I meant to be thinking about this in order to understand why the first block of code isn't allowed?
SUM() in your example is a no-op: SUM() of a COUNT() means the same as just COUNT(). So neither of your example queries appears to do anything useful.
It seems to me that nesting aggregates would only make sense if you wanted to apply two different aggregations - meaning GROUP BY on different sets of columns. To specify two different aggregations you would need to use the GROUPING SETS feature or SUM() OVER feature. Maybe if you explain what you want to achieve someone could show you how.
The gist of the issue is that there is no such concept as aggregate of an aggregate applied to a relation, see Aggregation. Having such a concept would leave too many holes in the definition and make the GROUP BY clause impossible to express: it would need to define both the inner aggregate's GROUP BY clause and the outer aggregate's as well! This also applies to the other aggregate attributes, like the HAVING clause.
However, the result of an aggregate applied to a relation is another relation, and this result relation in turn can support a new aggregate operator. This explains why you can aggregate the result into an outer SELECT. This leaves no ambiguity in the definition, each SELECT has its own distinct GROUP BY/HAVING clauses.
In simple terms, aggregation functions operate over a column and generate a scalar value, hence they cannot be applied over their result. When you create a select statement over a scalar value you transform it into an artificial column, that's why it can be used by an aggregation function again.
Please note that most of the time there's no point in applying an aggregation function over the result of another aggregation function: in your sample sum(count(id)) == count(id).
I would like to know what result you expect from this SQL:
select sum(count(id)) as 'count'
from table
When you use the COUNT function without a GROUP BY, only one result (the total count) is returned. So may I ask why you want to sum that single result?
You will certainly get the error, because an aggregate function cannot be applied to an expression containing an aggregate or a subquery.
It's working for me using SQLFiddle; I'm not sure why it wouldn't work for you. But I do have an explanation as to why it might not be working for you and why the alternative would work...
Your example is using a keyword as a column name, which may not always work. But when the column is only in a subexpression, the query engine is free to discard the name (in fact it probably does), so the fact that it potentially conflicts with a keyword may be disregarded.
EDIT: in response to your edit/comment. No, the two aren't equivalent. The RESULT would be equivalent, but the process of getting to that result is not at all similar. For the first to work, the parser has to do some work that simply doesn't make sense for it to do (applying an aggregate to a single value, either on a row by row basis or as). In the second case, an aggregate is applied to a table. The fact that the table is a temporary virtual table is unimportant to the aggregate function.
I think you can write a SQL query that produces the 'count' of rows for the required output. Aggregate functions such as SUM cannot take another aggregate or a subquery as their argument. My problem was resolved by using a simple SQL query to get the count.
Microsoft SQL Server doesn’t support it.
You can get around this problem by using a Derived table:
select sum(x.count)
from
(
select count(id) as 'count'
from table
) x
On the other hand using the below code will give you an error message.
select sum(count(id)) as 'count'
from table
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
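A quick way to see the difference is the following sketch, run on SQLite from Python rather than SQL Server (SQLite also rejects the nested form, with its own error text). The table t and its category column are hypothetical. The last query shows a case where aggregating an aggregate is actually useful: the inner SELECT has its own GROUP BY, so the outer aggregate has several rows to work on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE t (id INTEGER, category TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "a"), (3, "b")])

# Nesting one aggregate directly inside another is rejected.
try:
    conn.execute("SELECT SUM(COUNT(id)) FROM t").fetchall()
    nested_error = None
except sqlite3.OperationalError as exc:
    nested_error = str(exc)

# The derived table is an ordinary one-row relation, so SUM can run over it.
derived_sum = conn.execute(
    "SELECT SUM(x.cnt) FROM (SELECT COUNT(id) AS cnt FROM t) AS x"
).fetchone()[0]

# With an inner GROUP BY the outer aggregate is no longer a no-op:
# this finds the size of the largest category.
largest_group = conn.execute(
    "SELECT MAX(x.cnt) FROM "
    "(SELECT category, COUNT(id) AS cnt FROM t GROUP BY category) AS x"
).fetchone()[0]

print(nested_error, derived_sum, largest_group)
```

Each SELECT level carries its own GROUP BY, which is the ambiguity a directly nested aggregate would have no way to express.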

Any reason for GROUP BY clause without aggregation function?

I'm (thoroughly) learning SQL at the moment and came across the GROUP BY clause.
GROUP BY aggregates or groups the resultset according to the argument(s) you give it. If you use this clause in a query you can then perform aggregate functions on the resultset to find statistical information on the resultset like finding averages (AVG()) or frequency (COUNT()).
My question is: is the GROUP BY statement in any way useful without an accompanying aggregate function?
Update
Using GROUP BY as a synonym for DISTINCT is (probably) a bad idea because I suspect it is slower.
is the GROUP BY statement in any way useful without an accompanying aggregate function?
Using DISTINCT would be a synonym in such a situation, but the reason you'd want/have to define a GROUP BY clause would be in order to be able to define HAVING clause details.
If you need to define a HAVING clause, you have to define a GROUP BY - you can't do it in conjunction with DISTINCT.
You can perform a DISTINCT select by using a GROUP BY without any AGGREGATES.
GROUP BY is mainly used in two ways:
1) in conjunction with SQL aggregate functions
2) to eliminate duplicate rows from a result set
So the answer to your question lies in the second use described above.
Note: everything below only applies to MySQL
GROUP BY is guaranteed to return results in order (in MySQL versions before 8.0, which sorted GROUP BY results implicitly); DISTINCT is not.
GROUP BY along with ORDER BY NULL is of the same efficiency as DISTINCT (and implemented in the same way). If there is an index on the field being aggregated (or distinctified), both clauses use a loose index scan over this field.
In GROUP BY, you can return non-grouped and non-aggregated expressions. MySQL will pick arbitrary values from the corresponding group to calculate the expression.
With GROUP BY, you can omit the GROUP BY expressions from the SELECT clause. With DISTINCT, you can't. Every row returned by a DISTINCT is guaranteed to be unique.
It is used for more than just aggregate functions.
For example, consider the following code:
SELECT product_name, MAX(last_purchased) FROM products GROUP BY product_name
This will return only 1 result per product, with the most recent last_purchased value among that product's records.
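To illustrate the two uses discussed in this thread that have no aggregate in the SELECT list, here is a minimal sketch on SQLite from Python. The products rows are invented. It compares GROUP BY with DISTINCT for removing duplicates, then uses HAVING to filter groups, which DISTINCT cannot do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data, for illustration only.
conn.execute("CREATE TABLE products (product_name TEXT, last_purchased TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("bolt", "2020-01-01"),
        ("bolt", "2020-02-01"),
        ("nut", "2020-01-15"),
    ],
)

# GROUP BY with no aggregate removes duplicates, just like DISTINCT.
distinct_names = sorted(
    conn.execute("SELECT DISTINCT product_name FROM products").fetchall()
)
grouped_names = sorted(
    conn.execute("SELECT product_name FROM products GROUP BY product_name").fetchall()
)

# HAVING needs groups to filter, so this has no DISTINCT equivalent:
# keep only the names that occur more than once.
repeated = conn.execute(
    "SELECT product_name FROM products "
    "GROUP BY product_name HAVING COUNT(*) > 1"
).fetchall()

print(distinct_names, grouped_names, repeated)
```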