aggregate of an empty result set - sql

I would like the aggregates of an empty result set to be 0. I have tried the following:
SELECT SUM(COALESCE(capacity, 0))
FROM objects
WHERE null IS NOT NULL;
Result:
sum
-----
(1 row)
Subquestion: wouldn't the above work in Oracle, using SUM(NVL(capacity, 0))?

From the documentation page about aggregate functions:
It should be noted that except for count, these functions return a null value when no rows are selected. In particular, sum of no rows returns null, not zero as one might expect. The coalesce function may be used to substitute zero for null when necessary.
So, if you want to guarantee a value returned, apply COALESCE to the result of SUM, not to its argument:
SELECT COALESCE(SUM(capacity), 0) …
As for the Oracle 'subquestion', well, I couldn't find any notion of NULLs at the official doc page (the one for 10.2, in particular), but two other sources are unambiguous:
Oracle SQL Functions:
SUM([DISTINCT] n) Sum of values of n, ignoring NULLs
sum aggregate function [Oracle SQL]:
…if a sum() is created over some numbers, nulls are disregarded, as the following example shows…
That is, you needn't apply NVL to capacity. (But, like with COALESCE in PostgreSQL, you might want to apply it to SUM.)

The thing is, the aggregate always returns a row, even if no rows were aggregated (as is the case in your query). You summed an expression over no rows. Hence the null value you're getting.
Try this instead:
select coalesce(sum(capacity),0)
from objects
where false;

Just do this:
SELECT COALESCE( SUM(capacity), 0)
FROM objects
WHERE null IS NOT NULL;
By the way, COALESCE inside of SUM is redundant, even if capacity is NULL, it won't make the summary null.
To wit:
create table objects
(
capacity int null
);
insert into objects(capacity) values (1),(2),(NULL),(3);
select sum(capacity) from objects;
That will return a value of 6, not null.
And a coalesce inside an aggregate function is a performance killer too, as your RDBMS engine cannot just rip through all the rows, it has to evaluate each row's column if its value is null. I've seen a bit OCD query where all the aggregate queries has a coalesce inside, I think the original dev has a symptom of Cargo Cult Programming, the query is way very sloooowww. I removed the coalesce inside of SUM, then the query become fast.

Although this post is very old, but i would like to update what I use in such cases
SELECT NVL(SUM(NVL(capacity, 0)),0)
FROM objects
WHERE false;
Here external NVL avoids the cases when there is no row in the result set. Inner NVL is used for null column values, consider the case of (1 + null) and it will result in null. So inner NVL is also necessary other wise in alternate set default value 0 to the column.

Related

Null value with sum aggregate?

A query performs a sum aggregate over a single column of a table with 10 tuples. If exactly one of tuples has a NULL value on that column, which of the following will happen?
The query will return NULL.
The query will return the sum of the remaining 9 values.
The query will throw an exception.
Would this be 3?
Aggregate functions ignore null values. That's how their behaviour is defined.
So the answer to your question is: 2)
You can easily test that yourself:
create table test_null(value integer);
insert into test_null
values (1),(1),(1),(1),(1),(1),(1),(1),(1),(null);
select sum(value)
from test_null;
Returns
sum
---
9
The "ignoring" part is more obvious when you use the avg() aggregate function. The result for the above test data is 1, not 0.9 as one might think. That's because aggregates ignore the rows with null values and therefor the average is computed as 9/9.
select avg(value)
from test_null;
is equivalent to:
select avg(value)
from test_null
WHERE value IS NOT NULL;
Online example: http://rextester.com/QQREJS70393

How to conditionally aggregate a column based on another

a row of my data looks like this
[someId, someBool, someInt]
I'm looking for a way to aggregate someInt (to put them in an array specifically).
I use a GROUP BY clause to group by the someId field, then I can aggregate all the someInt using ARRAY_AGG but I only want to include rows where someBool=TRUE. How to approach this the right way ?
PS: It might be relevant to note what I got several booleans like someBool and would like to output to a different array each time
You can use ARRAY_AGG with IGNORE NULLS, e.g.:
ARRAY_AGG(IF(someBool IS NOT TRUE, NULL, someId) IGNORE NULLS)
This will only aggregate the IDs for which someBool is true. If you have multiple boolean columns that you want to use in the condition, you can AND them together or use a CASE WHEN ... or whatever other kind of condition you want that produces NULL in order to exclude a value.

SELECT * FROM Employees WHERE NULL IS NULL; SELECT * FROM Employees WHERE NULL = NULL;

I have recently started learning oracle and sql.
While learning I encountered a couple of queries which my friend was asked in an interview.
SELECT *
FROM Employees
WHERE NULL IS NULL;
this query yields all the rows in the Employees table.
As for as I have understood Oracle searches data in columns, so, NULL, is it treated as a column name here?
Am I correct when I say that Oracle searches for data in columns?
How come Oracle gives all the rows in this query?
In WHERE clause, is it not must that the left hand side of a condition be a COLUMN NAME?
Shouldn't it throw an error?
SELECT *
FROM Employees
WHERE NULL = NULL;
gives NO ROWS SELECTED.
Well, I understand that I can not compare a NULL value using operators except IS NULL and IS NOT NULL.
But why should it yield a result and not an error.
Could somebody explain me this.
Does Oracle treat NULL as a column as well as empty cells?
A where clause consists of conditional expressions. There is no requirement that a conditional expression consist of a column name on either side. In fact, although usually one or both sides are columns, it is not uncommon to have expressions that include:
subqueries
parameters
constants
scalar functions
One common instance is:
WHERE 1 = 1 AND . . .
This is a sign of automatically generated code. It is easier for some programmers to knit together conditions just by including AND <condition> but the clause needs an anchor. Hence, 1 = 1.
The way the WHERE clause works conceptually is that the clause is evaluated for each row produced by the FROM. If the clause evaluates to TRUE, then the row is kept in the result set (or for further processing). If it is FALSE or NULL, then the row is filtered out.
So, NULL IS NULL evaluates to TRUE, so all rows are kept. NULL = NULL evaluates to NULL, so no rows are kept.
NULL is NULL is always true, NULL = NULL is always false. Also, you aren't testing against any columns in either query (thus you are only going to get everything or nothing).

How to write sqlite select statement for NULL records

I have a column which contains null values in some of the rows.
I want to do sum of the column values by writing a select statement in sqlite.
How do I write the statement so that it treats null values as 0.
My current sqlite statement: select sum(amount) from table1
gives error as it returns null.
Please help.
Accordgin to SQLite documentation:
The sum() and total() aggregate functions return sum of all non-NULL values in the group. If there are no non-NULL input rows then sum() returns NULL but total() returns 0.0. NULL is not normally a helpful result for the sum of no rows but the
So I guess you'd better use total() instead of sum() in case you expect that the result be 'non-null'.
You can use ifnull(x,y) or coalesce(x,y,z,...) function. Each of them return the first non-null value from the parameter list from left to right. ifnull has exactly two parameter whereas coalesce has at least two.
select sum(ifnull(amount, 0)) from table1

Invalid Number Error! Can't seem to get around it

Oracle 10g DB. I have a table called s_contact. This table has a field called person_uid. This person_uid field is a varchar2 but contains valid numbers for some rows and in-valid numbers for other rows. For instance, one row might have a person_uid of '2-lkjsdf' and another might be 1234567890.
I want to return just the rows with valid numbers in person_uid. The SQL I am trying is...
select person_uid
from s_contact
where decode(trim(translate(person_uid, '1234567890', ' ')), null, 'n', 'c') = 'n'
The translate replaces all numbers with spaces so that a trim will result in null if the field only contained numbers. Then I use a decode statement to set a little code to filter on. n=number, c=char.
This seems to work when I run just a preview, but I get an 'invalid number' error when I add a filter of...
and person_uid = 100
-- or
and to_number(person_uid) = 100
I just don't understand what is happening! It should be filtering out all the records that are invalid numbers and 100 is obviously a number...
Any ideas anyone? Greatly Appreciated!
Unfortunately, the various subquery approaches that have been proposed are not guaranteed to work. Oracle is allowed to push the predicate into the subquery and then evaluate the conditions in whatever order it deems appropriate. If it happens to evaluate the PERSON_UID condition before filtering out the non-numeric rows, you'll get an error. Jonathan Gennick has an excellent article Subquery Madness that discusses this issue in quite a bit of detail.
That leaves you with a few options
1) Rework the data model. It's generally not a good idea to store numbers in anything other than a NUMBER column. In addition to causing this sort of issue, it has a tendency to screw up the optimizer's cardinality estimates which leads to less than ideal query plans.
2) Change the condition to specify a string value rather than a number. If PERSON_UID is supposed to be a string, your filter condition could be PERSON_UID = '100'. That avoids the need to perform the implicit conversion.
3) Write a custom function that does the string to number conversion and ignores any errors and use that in your code, i.e.
CREATE OR REPLACE FUNCTION my_to_number( p_arg IN VARCHAR2 )
RETURN NUMBER
IS
BEGIN
RETURN to_number( p_arg );
EXCEPTION
WHEN others THEN
RETURN NULL;
END;
and then my_to_number(PERSION_UID) = 100
4) Use a subquery that prevents the predicate from being pushed. This can be done in a few different ways. I personally prefer throwing a ROWNUM into the subquery, i.e. building on OMG Ponies' solution
WITH valid_persons AS (
SELECT TO_NUMBER(c.person_uid) 'person_uid',
ROWNUM rn
FROM S_CONTACT c
WHERE REGEXP_LIKE(c.personuid, '[[:digit:]]'))
SELECT *
FROM valid_persons vp
WHERE vp.person_uid = 100
Oracle can't push the vp.person_uid = 100 predicate into the subquery here because doing so would change the results. You could also use hints to force the subquery to be materialized or to prevent predicate pushing.
Another alternative is to combine the predicates:
where case when translate(person_uid, '1234567890', ' ')) is null
then to_number(person_uid) end = 100
When you add those numbers to the WHERE clause it's still doing those checks. You can't guarantee the ordering within the WHERE clause. So, it still tries to compare 100 to '2-lkjsdf'.
Can you use '100'?
Another option is to apply a subselect
SELECT * FROM (
select person_uid
from s_contact
where decode(trim(translate(person_uid, '1234567890', ' ')), null, 'n', 'c') = 'n'
)
WHERE TO_NUMBER(PERSON_UID) = 100
Regular expressions to the rescue!
where regexp_like (person_uid, '^[0-9]+$')
Use the first part of your query to generate a temp table. Then query the temp table based on person_uid = 100 or whatever.
The problem is that oracle tries to convert each person_uid to an int as it gets to it due to the additional and statement in your where clause. This behavior may or may not show up in the preview depending on what records where picked.