I have a table TEST_TABLE as follows:
Name   x_col   y_col
=======================
Jay    NULL    2
This is a simplistic representation of a much larger issue but will suffice.
When I do the following query I get NULL returned
SELECT SUM(x_col + y_col) FROM TEST_TABLE WHERE Name='Jay'
I want it to be 2. I thought the SUM() method ignores NULL values. How can I ignore values that are null in this query? Or actually in general, as this is a problem for a lot of my algorithms.
You get NULL because NULL + 2 returns NULL. SUM() does skip NULL inputs, but here the only row it sees is the NULL produced by the + expression, and SUM() over no non-NULL values returns NULL.
If you want NULL to be treated as 0, then use COALESCE():
SELECT SUM(COALESCE(x_col, 0) + COALESCE(y_col, 0))
FROM TEST_TABLE
WHERE Name = 'Jay';
One final note. If the WHERE clause filters out every row, the result will still be NULL. To get 0, you need an additional COALESCE():
SELECT COALESCE(SUM(COALESCE(x_col, 0) + COALESCE(y_col, 0)), 0)
FROM TEST_TABLE
WHERE Name = 'Jayden';
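To check the three cases above side by side, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the question does not say which database is in use), and the helper name scalar is made up for the example; the NULL arithmetic and SUM() behaviour shown are the standard SQL ones.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for the real engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST_TABLE (Name TEXT, x_col INTEGER, y_col INTEGER)")
conn.execute("INSERT INTO TEST_TABLE VALUES ('Jay', NULL, 2)")


def scalar(sql):
    """Hypothetical helper: run a query and return the single value it yields."""
    return conn.execute(sql).fetchone()[0]


# NULL + 2 is NULL, so SUM() has nothing non-NULL to add up.
plain = scalar("SELECT SUM(x_col + y_col) FROM TEST_TABLE WHERE Name = 'Jay'")

# COALESCE() inside the sum turns each NULL operand into 0.
fixed = scalar(
    "SELECT SUM(COALESCE(x_col, 0) + COALESCE(y_col, 0)) "
    "FROM TEST_TABLE WHERE Name = 'Jay'"
)

# No matching rows: SUM() over an empty set is still NULL.
no_rows = scalar(
    "SELECT SUM(COALESCE(x_col, 0) + COALESCE(y_col, 0)) "
    "FROM TEST_TABLE WHERE Name = 'Jayden'"
)

# The outer COALESCE() covers the empty-set case.
no_rows_fixed = scalar(
    "SELECT COALESCE(SUM(COALESCE(x_col, 0) + COALESCE(y_col, 0)), 0) "
    "FROM TEST_TABLE WHERE Name = 'Jayden'"
)

print(plain, fixed, no_rows, no_rows_fixed)  # None 2 None 0
```

Python shows SQL NULL as None, so the first and third results are the NULLs described above.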
Use COALESCE to replace NULL with 0.
SELECT sum(coalesce(x_col, 0) + coalesce(y_col, 0)) FROM TEST_TABLE WHERE Name='Jay'
I'm attempting to migrate postgres scripts over to bigquery, with the end goal of both scripts returning the exact same tables (schema and values).
I'm running into an issue when trying to replicate the behavior of least() in postgres in my bigquery selects.
In postgres, if any parameters of the least() call are null, they are skipped and the least non-null value is returned. In bigquery, however, if any of the parameters of the least() call are null, the function automatically returns null.
I'm looking for an elegant solution to replicate the postgres least() behavior in bigquery. My current—clunky—solution is below:
Postgres (returns -1):
SELECT LEAST(1, 0, -1, null)
BigQuery (returns null):
SELECT LEAST(1, 0, -1, null)
Postgres (returns -1):
SELECT LEAST(COALESCE(1, 0, -1, null),
COALESCE(0, 1, -1, null),
COALESCE(-1, 0, 1, null),
COALESCE(null, 0, -1, 1))
BigQuery (returns -1):
SELECT LEAST(COALESCE(1, 0, -1, null),
COALESCE(0, 1, -1, null),
COALESCE(-1, 0, 1, null),
COALESCE(null, 0, -1, 1))
This works but is a less-than-ideal solution.
In the original postgres script I need to migrate, there is nested logic like least(w, x, least(y, z)) so that fix gets exponentially more unreadable as the number of values/complexity grows. That same issue applies when you try to do this as a massive CASE block.
If anyone has an obvious fix that I'm missing or a more elegant way to mirror the postgres behavior in bigquery, it is much appreciated!
There is a simple workaround for BigQuery Standard SQL.
You just create your own function (let's say myLeast).
It works "standalone" as well as in a nested scenario:
#standardSQL
CREATE TEMP FUNCTION myLeast(x ARRAY<INT64>) AS
((SELECT MIN(y) FROM UNNEST(x) AS y));
SELECT
LEAST(1, 0, -1, NULL) AS least_standard,
LEAST(COALESCE(1, 0, -1, NULL),
COALESCE(0, 1, -1, NULL),
COALESCE(-1, 0, 1, NULL),
COALESCE(NULL, 0, -1, 1)) AS least_less_than_ideal,
myLeast([1, 0, -1, NULL]) AS least_workaround,
myLeast([1, 0, -1, NULL, myLeast([2, 0, -2, NULL])]) AS least_with_nested
Output is
least_standard  least_less_than_ideal  least_workaround  least_with_nested
null            -1                     -1                -2
The first two are from your question; the third and fourth are the "standalone" and nested workarounds.
Hope you can apply this approach to your specific case
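The reason the UNNEST trick works is that aggregate MIN skips NULLs, while the scalar multi-argument form does not. A rough way to see the difference without a BigQuery project is SQLite, whose scalar MIN(a, b, ...) also returns NULL when any argument is NULL. This is only an analogy: SQLite has no arrays or UNNEST, so the sketch below spells the values out as rows, and the helper name scalar is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Hypothetical helper: run a query and return the single value it yields."""
    return conn.execute(sql).fetchone()[0]


# Scalar form: one NULL argument makes the whole result NULL
# (the behaviour the question sees with LEAST in BigQuery).
scalar_min = scalar("SELECT MIN(1, 0, -1, NULL)")

# Aggregate form over one row per value: NULL rows are skipped,
# which is what MIN(y) FROM UNNEST(x) relies on.
aggregate_min = scalar(
    "SELECT MIN(v) FROM ("
    " SELECT 1 AS v UNION ALL SELECT 0 UNION ALL SELECT -1 UNION ALL SELECT NULL"
    ")"
)

# Only when every value is NULL does the aggregate return NULL,
# matching the PostgreSQL LEAST rule quoted in this thread.
all_null = scalar("SELECT MIN(v) FROM (SELECT NULL AS v UNION ALL SELECT NULL)")

print(scalar_min, aggregate_min, all_null)  # None -1 None
```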
Both Oracle and Vertica behave the same as BigQuery, following the general rule for SQL functions: if one of the arguments is NULL, the result is NULL. PostgreSQL makes an exception to that rule, explicitly stating in its documentation:
The result will be NULL only if all the expressions evaluate to NULL.
Note that GREATEST and LEAST are not in the SQL standard, but are a
common extension. Some other databases make them return NULL if any
argument is NULL, rather than only when all are NULL.
I would open a feature request in the BigQuery issue tracker to add an IGNORE NULLS parameter to LEAST and GREATEST to get PostgreSQL-compatible behavior. Even though IGNORE NULLS normally applies only to aggregate functions, LEAST and GREATEST are somewhat similar to aggregate functions.
Without a function:
select
(select min(col) from unnest([a,b,c,d,e]) col) least,
(select max(col) from unnest([a,b,c,d,e]) col) greatest,
*
from
(
select 1 a, 2 b, 3 c, null d, 5 e
union all
select null a, null b, null c, null d, null e
) tbl
Maybe something like this could work?
WITH tbl AS(
SELECT 1 AS a, 2 AS b
UNION ALL SELECT NULL, 2
UNION ALL SELECT 1, NULL
UNION ALL SELECT NULL, NULL
)
SELECT
tbl.*
, COALESCE( LEAST(a, b), a , b)
FROM tbl
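A quick check of this two-column pattern, again as a sketch on SQLite rather than BigQuery: SQLite has no LEAST, so its scalar MIN(a, b) is used as a stand-in with the same "any NULL gives NULL" behaviour. The last query is an extra case not in the answer, added to show where the pattern stops working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same four rows as the answer; MIN(a, b) stands in for LEAST(a, b).
rows = conn.execute(
    """
    WITH tbl(a, b) AS (
        SELECT 1, 2
        UNION ALL SELECT NULL, 2
        UNION ALL SELECT 1, NULL
        UNION ALL SELECT NULL, NULL
    )
    SELECT a, b, COALESCE(MIN(a, b), a, b) FROM tbl
    """
).fetchall()

for row in rows:
    print(row)

# With three or more arguments the fallback is no longer correct:
# it returns the first non-NULL argument, not the least one.
three_args = conn.execute("SELECT COALESCE(MIN(5, NULL, 1), 5, NULL, 1)").fetchone()[0]
print(three_args)  # 5, whereas PostgreSQL's LEAST(5, NULL, 1) gives 1
```

So the COALESCE wrapper is fine for exactly two arguments, but for longer argument lists the aggregate-over-array approaches above are the safer route.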
How about this? :) "The Postgres library" :)
DECLARE input STRING DEFAULT (
WITH t AS
(
SELECT 1 a, 0 b, -1 c, null d
UNION ALL
SELECT 0, 1, -1, null
UNION ALL
SELECT -1, 0, 1, null
UNION ALL
SELECT null, 0, -1, 1
)
SELECT '['||STRING_AGG("'"||TO_JSON_STRING(t)||"'")||']' FROM t
)
;
EXECUTE IMMEDIATE '''
SELECT * FROM EXTERNAL_QUERY("project.location.connection",
\'\'\'
SELECT
GREATEST (
(t::json->>'a')::INT,
(t::json->>'b')::INT,
(t::json->>'c')::INT,
(t::json->>'d')::INT
)
FROM
UNNEST (ARRAY '''||input||") AS t ''')"
Of course it may improve with dynamic "body" for GREATEST but I hope nobody will use it.
It's too sad to live in a world with no IGNORE NULLS for GREATEST and LEAST... and where it's impossible to pass a string variable directly into EXTERNAL_QUERY 😔 😔 😔
Questions to community
Does anyone know about limitations of this "method"?
How long a string can EXECUTE IMMEDIATE execute?
And how long a string can be passed as an argument to EXTERNAL_QUERY?
In my SSRS report layout, there is a parameter #PositiveOrNegative with three values: Positive, Negative and Both. In my report there is a column A. For example:
When I select Positive in #PositiveOrNegative, column A will display only positive values.
When I select Negative in #PositiveOrNegative, column A will display only negative values.
When I select Both in #PositiveOrNegative, column A will display all values, regardless of whether they are positive or negative.
Can someone write me a query expression for the situation above?
Maybe something like this:
=Switch(
Parameters!PositiveOrNegative.Value = "Positive" AND Fields!A.Value > 0, Fields!A.Value,
Parameters!PositiveOrNegative.Value = "Negative" AND Fields!A.Value < 0, Fields!A.Value,
Parameters!PositiveOrNegative.Value = "Both", Fields!A.Value
)
Set the value expression for the textbox to the following (assuming you want to display 0's as positive):
=IIf
(
(
Parameters!PositiveOrNegative.Value = "Positive" And Fields!Column_A.Value >= 0
Or Parameters!PositiveOrNegative.Value = "Negative" And Fields!Column_A.Value < 0
Or Parameters!PositiveOrNegative.Value = "Both"
),
Fields!Column_A.Value,
Nothing
)
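For anyone who wants to sanity-check the branching before putting it in the report, the same decision can be sketched in Python. This is not SSRS code; the function name displayed_value is made up, and None plays the role of Nothing (a blank cell). Like the IIf version, it treats 0 as positive.

```python
def displayed_value(mode, a):
    """Return the value to show in the cell, or None to leave it blank.

    Hypothetical mirror of the IIf expression: mode is the report
    parameter ("Positive", "Negative" or "Both"), a is the column value.
    """
    if mode == "Both":
        return a
    if mode == "Positive" and a >= 0:
        return a
    if mode == "Negative" and a < 0:
        return a
    return None


print(displayed_value("Positive", 7))   # 7
print(displayed_value("Positive", -3))  # None
print(displayed_value("Negative", -3))  # -3
print(displayed_value("Both", -3))      # -3
```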
I have a table with about a hundred rows. It has a column is_gallery that contains either 1, 0, or NULL. If I do...
SELECT * WHERE is_gallery != 1
or
SELECT * WHERE NOT (is_gallery = 1)
it excludes the rows where is_gallery is null. I can manage to get a proper response if I do
SELECT * WHERE (is_gallery = 0 OR is_gallery is null)
But shouldn't the "!=" or NOT work? Isn't there a way to just return the rows where is_gallery doesn't equal 1 without testing for every other possibility?
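You can use the IS and IS NOT operators instead of = and !=. These treat NULL like a normal value. (This works in SQLite; PostgreSQL spells it IS DISTINCT FROM, and MySQL has the null-safe <=> operator.)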
SELECT * FROM yourTable WHERE is_gallery IS NOT 1
The best thing to use is coalesce as in:
SELECT *
FROM yourTable
WHERE coalesce(is_gallery,0) != 1;
What coalesce does is replace any null value in that column with the second parameter. In the example above, any nulls in the "is_gallery" column will be replaced with 0 before being compared with 1, so the comparison will of course return true.
On NULL: realize that a NULL value isn't equal to ANYTHING, not even NULL itself. It cannot be compared, so any "comparison" with it yields NULL (unknown) rather than TRUE, and WHERE drops the row. NULL has its own special operators, "IS NULL" and "IS NOT NULL".
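The behaviour discussed in this thread can be reproduced with a three-row table. The sketch below uses Python's sqlite3 module; the id column and the helper name ids are additions for the example, and the IS NOT form is shown because SQLite accepts it (other engines may not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, is_gallery INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, 1), (2, 0), (3, None)],  # None is stored as NULL
)


def ids(where):
    """Hypothetical helper: ids of the rows that pass the given WHERE clause."""
    sql = f"SELECT id FROM yourTable WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# NULL != 1 evaluates to NULL, not TRUE, so row 3 is filtered out.
not_equal = ids("is_gallery != 1")

# IS NOT compares NULL like an ordinary value (SQLite syntax).
is_not = ids("is_gallery IS NOT 1")

# COALESCE() swaps NULL for 0 before the comparison happens.
coalesced = ids("COALESCE(is_gallery, 0) != 1")

# Comparisons involving NULL yield NULL, shown by Python as None.
null_cmp = conn.execute("SELECT NULL = NULL, NULL != 1").fetchone()

print(not_equal, is_not, coalesced, null_cmp)  # [2] [2, 3] [2, 3] (None, None)
```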
I have a query which retrieves a count from each of 2 tables.
Now in the same query it has (countresult1-countresult2) AS restresult
Now restresult is sometimes less than 0 (e.g. -10), but I want it to return 0 if it's under 0.
Uhm, did I explain that right? The minimum value should be 0, not below.
Cheers!!!
GREATEST((countresult1-countresult2), 0) AS restresult
if (countresult1<countresult2, 0, countresult1-countresult2) as restresult
Neither countresult1 nor countresult2 can be negative (they are counts), so the above should be safe.
without seeing your query, you could have something like...
MAX( if( countresult1-countresult2 < 0, 0, countresult1-countresult2 )) as YourResult
Take the maximum of 0 and the value you calculated, like this:
SELECT GREATEST(your-restresult-query,0)
FROM ... (etc)
Until now I didn't know there were if-else constructs in SQL, but I found some.
You will want to use:
CASE WHEN (countresult1-countresult2) < 0 THEN 0 ELSE (countresult1-countresult2) END AS restresult
Here is the source where I found the SQL information: http://www.tizag.com/sqlTutorial/sqlcase.php
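The GREATEST and CASE answers should give the same clamped result. Here is a small sketch to compare them, run on SQLite through Python's sqlite3 module. SQLite has no GREATEST, so its two-argument scalar MAX(x, 0) is used as a stand-in; the function name clamp and the bound parameters c1 and c2 are made up in place of the real count subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def clamp(c1, c2):
    """Return (max_form, case_form) for c1 - c2 clamped at zero.

    Hypothetical helper: c1 and c2 stand in for countresult1 and countresult2.
    """
    return conn.execute(
        "SELECT MAX(:c1 - :c2, 0), "
        "CASE WHEN (:c1 - :c2) < 0 THEN 0 ELSE (:c1 - :c2) END",
        {"c1": c1, "c2": c2},
    ).fetchone()


print(clamp(5, 15))  # (0, 0): the difference is -10, clamped to 0
print(clamp(15, 5))  # (10, 10): positive differences pass through
```

The CASE form is the most portable of the options in this thread, since it does not depend on GREATEST or IF() being available.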