PostgreSQL GREATEST with NULLs - sql

Using PostgreSQL, I am looking for something like SELECT GREATEST(0,x) where x can be NULL. In case x IS NULL, the query should return NULL, similar to MySQL and Google BigQuery and not 0 which is the standard behavior in PostgreSQL. Is there an easy way to accomplish this without cases and conditions?
SELECT GREATEST(0,NULL) should return NULL, not 0
In the official documentation:
The GREATEST and LEAST functions select the largest or smallest value
from a list of any number of expressions. The expressions must all be
convertible to a common data type, which will be the type of the
result (see Section 10.5 for details). NULL values in the list are
ignored. The result will be NULL only if all the expressions evaluate
to NULL.
Note that GREATEST and LEAST are not in the SQL standard, but are a
common extension. Some other databases make them return NULL if any
argument is NULL, rather than only when all are NULL.
https://www.postgresql.org/docs/9.6/functions-conditional.html
I'm looking for a GREATEST function that does not ignore NULLs

You can write your own function:
create function strange_greatest(variadic p_input int[])
returns int
as
$$
select v
from unnest(p_input) as t(v)
order by v desc nulls first
limit 1;
$$
language sql
immutable;
postgres=> select strange_greatest(1,2,4);
 strange_greatest
------------------
                4
(1 row)

postgres=> select strange_greatest(1,2,null,4);
 strange_greatest
------------------

(1 row)

You could add another expression. For numbers:
select greatest(a, b, c) + (a + b + c - (a + b + c))
This is a bit more challenging for other data types. But arrays can help:
select greatest(a, b, c) + nullif(array_position(array[a, b, c], null) is not null, true)::int
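A quick sanity check of the arithmetic trick using inline VALUES rows (a sketch; the columns a, b, c are assumed to be integers):

```sql
-- null-free case: the correction term (a + b + c - (a + b + c)) is 0,
-- so greatest() passes through unchanged
select greatest(a, b, c) + (a + b + c - (a + b + c))
from (values (1, 2, 3)) as t(a, b, c);
-- returns 3

-- any null makes the correction term null, which nulls the whole result
select greatest(a, b, c) + (a + b + c - (a + b + c))
from (values (1, null::int, 3)) as t(a, b, c);
-- returns null
```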

For the case of 2 numbers (columns), if your table is like this:
create table tablename(a int, b int);
insert into tablename(a, b) values
(10, 20), (null, 30), (40, null);
then use the function greatest() like this:
select
greatest((a + b) - b, (a + b) - a) "greatest"
from tablename;
If a or b is null then both expressions: (a + b) - b and (a + b) - a are null and the function greatest() will return null.
Results:
| greatest |
| -------- |
| 20 |
| null |
| null |

Related

How to find records in PostgreSQL matching a combination of a pair of nullable String columns

Suppose a PostgreSQL table, articles, contains two nullable String columns of name and alt_name.
Now, I want to find records (rows) in the table that have
a combination of String name and alt_name matches another combination of the same type in the same table:
i.e., [a.name, a.alt_name] is equal to either [b.name, b.alt_name] or [b.alt_name, b.name]
where name or alt_name may be NULL or an empty String, and in any circumstances NULL and an empty String should be treated as identical;
e.g., when [a.name, a.alt_name] == ["abc", NULL], a record of [b.name, b.alt_name] == ["", "abc"] should match, because one of them is "abc" and the other is NULL or empty String.
Is there any neat query to achieve this?
I thought if there is a way to concatenate both columns with a UTF-8 replacement character (U+FFFD) in between, where NULL is converted into an empty String, that would solve the problem. Say, if the function were magic_fn(), the following would do a job, providing there is a unique column id:
SELECT * FROM articles a INNER JOIN articles b ON a.id <> b.id
WHERE
magic_fn(a.name, a.alt_name) = magic_fn(b.name, b.alt_name)
OR magic_fn(a.name, a.alt_name) = magic_fn(b.alt_name, b.name);
-- [EDIT] corrected from the original post, which was simply wrong.
However, concatenation is not a built-in function in PostgreSQL and I don't know how to do this.
[EDIT] As commented by @Serg and in answers, a string-concatenation function is now available in PostgreSQL from Ver.9.1 (CONCAT or ||); n.b., it actually accepts non-String input as long as one of them is a String-type as of Ver.15.
Or, maybe there is simply a better way?
You can create a function which takes in the name and alt_name, then returns an aggregated string with nulls converted to empty strings and the results sorted:
create function magic_fn(a text, b text) returns text
return (select json_agg(t.v) from (
select t1.* from (
select coalesce(a, '') v
union all
select coalesce(b, '') v) t1
order by t1.v) t);
create table articles (id int, name text, alt_name text);
insert into articles values (1, 'abc', null), (2, 'abc', ''), (3, null, 'abc'), (4, 'aaa', 'a'), (5, 'aaa', 'a'), (6, 'a', 'aaa')
Usage:
select * from articles a join articles b
on a.id <> b.id and magic_fn(a.name, a.alt_name) = magic_fn(b.name, b.alt_name)
try this
SELECT * FROM articles a
cross join articles b
where
(ARRAY[COALESCE(a.name,''),COALESCE(a.alt_name,'')] @> ARRAY[COALESCE(b.name,''),COALESCE(b.alt_name,'')])
and (ARRAY[COALESCE(a.name,''),COALESCE(a.alt_name,'')] <@ ARRAY[COALESCE(b.name,''),COALESCE(b.alt_name,'')])
and a.id<>b.id
and a.id<b.id --optional (to avoid reverse matching)
Having reviewed a few answers (special thanks to @MitkoKeckaroski), I have come up with this short solution. COALESCE() is not necessary!
The condition is that the UTF-8 replacement character (U+FFFD) must never appear in the data, which you can safely assume according to the Unicode specification.
SELECT * FROM articles a JOIN articles b
ON a.id <> b.id AND
ARRAY[CONCAT(a.name, U&'\FFFD', a.alt_name),
CONCAT(a.alt_name, U&'\FFFD', a.name)] @>
ARRAY[CONCAT(b.name, U&'\FFFD', b.alt_name)];
See db<>fiddle (where I extended the data prepared by @Ajax1234 – thank you!)
you can try to use
coalesce to convert null to an empty string
|| to concatenate strings
and then compare the strings like this:
(coalesce(a.name,'') || coalesce(a.altname,'')) = (coalesce(b.name,'') || coalesce(b.altname,''))
or
(coalesce(a.name,'') || coalesce(a.altname,'')) = (coalesce(b.altname,'') || coalesce(b.name,''))
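Put together as a self-join (a sketch, assuming the articles table and id column from the question; note the snippet above writes altname where the question's schema uses alt_name):

```sql
SELECT *
FROM articles a
JOIN articles b ON a.id <> b.id
WHERE (coalesce(a.name, '') || coalesce(a.alt_name, '')) =
      (coalesce(b.name, '') || coalesce(b.alt_name, ''))
   OR (coalesce(a.name, '') || coalesce(a.alt_name, '')) =
      (coalesce(b.alt_name, '') || coalesce(b.name, ''));
```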
You can create an array from both names, remove null and empty values, then check if the arrays overlap (have elements in common)
select *
from articles
where array_remove(array[nullif(name,''), nullif(alt_name,'')], null) && array['abc']
This can be made easier by creating a function that generates such an array:
create or replace function combine_names(p_names variadic text[])
returns text[]
as
$$
select array_agg(name)
from unnest(p_names) as x(name)
where nullif(trim(name),'') is not null;
$$
language sql
immutable
called on null input;
By making the parameter variadic it's possible to provide a different number of arguments (in theory even more than two)
select *
from articles
where combine_names(name, alt_name) && combine_names('abc')
select *
from articles
where combine_names(name, alt_name) && combine_names('abc', null)
select *
from articles
where combine_names(name, alt_name) && combine_names('abc', 'def')

How to update field value by determining each digit in field?

Although I saw update statements to update field based on existing values, I could not find anything similar to this scenario:
Suppose you have a table with only one column of number(4) type. The value in the first record is 1010.
create table stab(
nmbr number(4)
);
insert into stab values(1010);
For each digit
When the digit is 1 -- add 3 to the digit
When the digit is 0 -- add four to the digit
end
This operation needs to be completed in a single statement without using PL/SQL.
I think the substr function needs to be used but I don't know how to go about completing this.
Thanks in advance.
SELECT DECODE(SUBSTR(nmbr,1,1), '1', 1 + 3, '0', 0 + 4) AS Decoded_Nmbr
FROM stab
ORDER BY Decoded_Nmbr
Is that what you are after?
So, it seems you need to convert every 0 and 1 to a 4, and leave all the other digits alone. This seems like a string operation (and the reference to "digits" itself suggests the same thing). So, convert the number to a string, use the Oracle TRANSLATE function (see the documentation), and convert back to number.
update stab
set nmbr = to_number(translate(to_char(nmbr, '9999'), '01', '44'))
;
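A quick check of the inner expression (Oracle; note that to_char(nmbr, '9999') produces a leading space for the sign position, which translate leaves untouched and to_number ignores):

```sql
-- every 1 becomes 1 + 3 = 4 and every 0 becomes 0 + 4 = 4
select to_number(translate(to_char(1010, '9999'), '01', '44')) as result
from dual;
-- 4444
```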
assuming it's always a 4-digit number, you could use substring like below
-- postgres SQL example
SELECT CASE
WHEN a = 0 THEN a + 4
ELSE a + 3
end AS a,
CASE
WHEN b = 0 THEN b + 4
ELSE b + 3
end AS b,
CASE
WHEN c = 0 THEN c + 4
ELSE c + 3
end AS c,
CASE
WHEN d = 0 THEN d + 4
ELSE d + 3
end AS d
FROM ( SELECT Substr( '1010', 1, 1 ) :: INT AS a,
Substr( '1010', 2, 1 ) :: INT b,
Substr( '1010', 3, 1 ) :: INT c,
Substr( '1010', 4, 1 ) :: INT d )a
Another option (tried in PostgreSQL) may be to split the number into rows using regexp_split_to_table, add to each individual digit based on the case expression, and then concatenate the digits back into a string:
SELECT array_to_string ( array
(
select
case
WHEN val = 0 THEN val +4
ELSE val +3
END
FROM (
SELECT regexp_split_to_table ( '101010','' ) ::INT val
) a
) ,'' )
My answer to the interview question would have been that the DB design violates the rules of normalization (i.e. a bad design) and would not have this kind of "update anomaly" if it were properly designed. Having said that, it can easily be done with an expression using various combinations of single row functions combined with the required arithmetic operations.

How to aggregate integers in PostgreSQL?

I have a query that gives list of IDs:
ID
2
3
4
5
6
25
ID is integer.
I want to get that result like that in ARRAY of integers type:
ID
2,3,4,5,6,25
I wrote this query:
select string_agg(ID::text,',')
from A
where .....
I have to convert it to text, otherwise it won't work; string_agg expects (text, text).
This works fine; the thing is that this result should later be used in many places that expect an ARRAY of integers.
I tried :
select ('{' || string_agg(ID::text,',') || '}')::integer[]
from A
WHERE ...
which gives: {2,3,4,5,6,25} in type int4 integer[]
but this isn't the correct type... I need the same type as ARRAY.
for example SELECT ARRAY[4,5] gives array integer[]
in simple words I want the result of my query to work with (for example):
select *
from b
where b.ID = ANY (FIRST QUERY RESULT) // aka: = ANY (ARRAY[2,3,4,5,6,25])
this is failing as ANY expect array and it doesn't work with regular integer[], i get an error:
ERROR: operator does not exist: integer = integer[]
note: the result of the query is part of a function and will be saved in a variable for later work. Please don't take it to places where you bypass the problem and offer a solution which won't give the ARRAY of Integers.
EDIT: why does
select *
from b
where b.ID = ANY (array [4,5])
work, but
select *
from b
where b.ID = ANY(select array_agg(ID) from A where ..... )
doesn't?
select *
from b
where b.ID = ANY(select array_agg(4))
doesn't work either
the error is still:
ERROR: operator does not exist: integer = integer[]
The expression select array_agg(4) returns a set of rows (actually a set containing a single row). Hence the query
select *
from b
where b.id = any (select array_agg(4)) -- ERROR
tries to compare an integer (b.id) to a value of a row (which has 1 column of type integer[]). It raises an error.
To fix it you should use a subquery which returns integers (not arrays of integers):
select *
from b
where b.id = any (select unnest(array_agg(4)))
Alternatively, you can place the column name of the result of select array_agg(4) as an argument of any, e.g.:
select *
from b
cross join (select array_agg(4)) agg(arr)
where b.id = any (arr)
or
with agg as (
select array_agg(4) as arr)
select *
from b
cross join agg
where b.id = any (arr)
More formally, the first two queries use ANY of the form:
expression operator ANY (subquery)
and the other two use
expression operator ANY (array expression)
like it is described in the documentation: 9.22.4. ANY/SOME
and 9.23.3. ANY/SOME (array).
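A third variant (a sketch; the cast is what forces the array-expression form of ANY, assuming a table a with an integer column id as in the question):

```sql
-- the scalar subquery yields one integer[] value; casting it makes ANY
-- treat it as an array expression rather than a subquery
select *
from b
where b.id = any ((select array_agg(id) from a)::int[]);
```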
How about this query? Does this give you the expected result?
SELECT *
FROM b b_out
WHERE EXISTS (SELECT 1
FROM b b_in
WHERE b_out.id = b_in.id
AND b_in.id IN (SELECT <<first query that returns 2,3,4,...>>))
What I've tried to do is break down the logic of ANY into two separate logical checks in order to achieve the same result.
Hence, ANY is equivalent to a combination of EXISTS and at least one of the values being IN your list of values returned by the first SELECT.

SQL minus 2 columns - with null values

I have this table (made from a SQL query):
Row 1   Row 2
2       1
3       NULL
And I want to minus the 2 columns, so I just select like this:
Select Row1 - Row2
From table
But then I get this result:
1
NULL
instead of:
1
3
How can I make it possible to get the last result?
Please try:
SELECT ISNULL([Row 1], 0) - ISNULL([Row 2], 0) from YourTable
For more Information visit ISNULL
The reason you got this result is that any mathematical operation involving NULL produces NULL, so for the subtraction all NULL values should be read as 0 - with ISNULL().
Hence
SELECT ISNULL([Row 1], 0) - ISNULL([Row 2], 0) from YourTable
The MySQL equivalent of ISNULL is IFNULL
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns
expr2.
Maybe also look at SQL NULL Functions
The ISNULL from MySQL is used to check if a value is null
If expr is NULL, ISNULL() returns 1, otherwise it returns 0.
In SQL, anything minus NULL is always NULL, so you need to convert NULL to zero:
SELECT ISNULL(ROW1,0)-ISNULL(ROW2,0) FROM YOUR_TABLE
Select Row1 - COALESCE(Row2,0)
From table

How to find least non-null column in one particular row in SQL?

I am trying to find the lowest number in two columns of a row in the same table, with the caveat that one of the columns may be null in a particular row. If one of the columns is null, I want the value in the other column returned for that row, as that is the lowest non-null column in this case. If I use the least() function in MySQL 5.1:
select least(1,null)
This returns null, which is not what I want. I need the query to return 1 in this case.
I've been able to get the result I want in general with this query:
select least(coalesce(col1, col2), coalesce(col2, col1))
As long as col1 and col2 are not both null, each coalesce call will return a number, and least() handles finding the lowest.
Is there a simpler/faster way to do this? I'm using MySQL in this instance but general solutions are welcomed.
Unfortunately (for your case) behaviour of LEAST was changed in MySQL 5.0.13 (http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_least) - it used to return NULL only if all arguments are NULL.
This change was even reported as a bug: http://bugs.mysql.com/bug.php?id=15610
But the fix was only to MySQL documentation, explaining new behaviour and compatibility break.
Your solution was one of the recommended workarounds. Another can be using IF operator:
SELECT IF(Col1 IS NULL OR Col2 IS NULL, COALESCE(Col1, Col2), LEAST(Col1,Col2))
Depending on your corner case of having all values be null, I would go for the following syntax, which is more readable (an easier solution if you have exactly two columns is below!)
SELECT LEAST( IFNULL(5, ~0 >> 1), IFNULL(10, ~0 >> 1) ) AS least_date;
-- Returns: 5
SELECT LEAST( IFNULL(null, ~0 >> 1), IFNULL(10, ~0 >> 1) ) AS least_date;
-- Returns: 10
SELECT LEAST( IFNULL(5, ~0 >> 1), IFNULL(null, ~0 >> 1) ) AS least_date;
-- Returns: 5
SELECT LEAST( IFNULL(null, ~0 >> 1), IFNULL(null, ~0 >> 1)) AS least_date;
-- Returns: @MAX_VALUE (if you need to use it as a default value)
SET @MAX_VALUE = ~0 >> 1;
SELECT LEAST( IFNULL(null, @MAX_VALUE), IFNULL(null, @MAX_VALUE)) AS least_date;
-- Returns: @MAX_VALUE (if you need to use it as a default value). Variables just make it more readable!
SET @MAX_VALUE = ~0 >> 1;
SELECT NULLIF(
LEAST( IFNULL(null, @MAX_VALUE), IFNULL(null, @MAX_VALUE)),
@MAX_VALUE
) AS least_date;
-- Returns: NULL
That is my preferred way if:
you can ensure that at least one column cannot be NULL
in the corner case (all columns are NULL) you want a non-null default value which is greater than any possible value, or can be limited to a certain threshold
you can use variables to make the statement even more readable
If you question yourself what ~0 >> 1 means:
It's just a short hand for saying "Give me the greatest number available". See also: https://stackoverflow.com/a/2679152/2427579
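To see what the shorthand evaluates to (MySQL; ~0 is the all-ones BIGINT UNSIGNED value, and one right shift clears the top bit):

```sql
SELECT ~0 >> 1 AS max_bigint;
-- 9223372036854775807, the largest signed BIGINT
```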
Even better, if you have only two columns, you can use:
SELECT LEAST( IFNULL(col1, col2), IFNULL(col2, col1) ) AS least_date;
-- Returns: NULL (if both columns are null) or the least value
This is how I solved it:
select coalesce(least(col1, col2), col1, col2)
If one value is NULL, the query will return the first non-NULL value. You can even add a default value as the last parameter, if both values can be NULL.
This may perform a bit better (may have to be converted to the corresponding MySQL syntax):
SELECT
CASE
WHEN Col1 IS NULL THEN Col2
WHEN Col2 IS NULL THEN Col1
ELSE Least(Col1, Col2)
END
Another alternative (probably slower though, but worth a try):
SELECT Col1 FROM yourtable
WHERE Col2 IS NULL
UNION
SELECT Col2 FROM yourtable
WHERE Col1 IS NULL
UNION
SELECT least(Col1, Col2) FROM yourtable
WHERE Col1 IS NOT NULL AND Col2 IS NOT NULL
Why not set the value of one column to be equal to the other column when it's NULL?
SELECT LEAST(IFNULL(COL1, COL2), IFNULL(COL2, COL1));
with the code above, the null value will be ignored unless both are null.
e.g.
COL1 = NULL, COL2 = 5
LEAST(IFNULL(NULL, 5), IFNULL(5, NULL)) -> LEAST(5, 5) -> 5
COL1 = 3, COL2 = NULL
LEAST(IFNULL(3, NULL), IFNULL(NULL, 3)) -> LEAST(3, 3) -> 3
COL1 = NULL, COL2 = NULL
LEAST(IFNULL(NULL, NULL), IFNULL(NULL, NULL)) -> LEAST(NULL, NULL) -> NULL
SELECT
MIN(LEAST(COALESCE(COL1, COL2), COALESCE(COL2, COL1)))
FROM yourtable
WHERE COL1 IS NOT NULL
AND COL2 IS NOT NULL;
I've created a function which handles any number of dates, by concatenating them with a separator (CONCAT_WS) as first parameter to the function.
CONCAT_WS besides dynamic number of parameters, will remove all NULL dates ;)
The function accepts two parameters:
a delimiter-separated string of dates as TEXT
the delimiter as TEXT (the same one used with CONCAT_WS!!) - you can remove it if you always use your preferred separator with CONCAT_WS.
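The NULL-skipping behavior of CONCAT_WS mentioned above is what makes this work; a quick illustration (MySQL):

```sql
SELECT CONCAT_WS(',', '2019-03-04', NULL, '2019-01-02') AS joined;
-- '2019-03-04,2019-01-02' : NULL arguments are skipped entirely
```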
CREATE FUNCTION `min_date`(`dates` TEXT, `delim` VARCHAR(10)) RETURNS DATE NO SQL DETERMINISTIC
BEGIN
DECLARE `result` DATE DEFAULT NULL;
DECLARE `count` TINYINT DEFAULT 0;
DECLARE `temp` DATE DEFAULT NULL;
IF `delim` IS NULL THEN SET `delim` = ','; END IF;
IF `dates` IS NOT NULL AND CHAR_LENGTH(`dates`) > 0 THEN
SET `count` = LENGTH(`dates`) - LENGTH(REPLACE(`dates`, `delim`, SPACE(CHAR_LENGTH(`delim`) - 1)));
WHILE `count` >= 0 DO
SET `temp` = SUBSTRING_INDEX(SUBSTRING_INDEX(`dates`, `delim`, `count` + 1), `delim`, -1);
IF `result` IS NULL OR `result` > `temp` THEN SET `result` = `temp`; END IF;
SET `count` = `count` - 1;
END WHILE;
END IF;
RETURN `result`;
END
Then, you can use in any combination of date fields or as static strings (as long as are valid dates or NULL):
SELECT min_date(CONCAT_WS(',', `date_column_1`, NULL, '2019-03-04', `date_column_2`), ',') AS `min_date`
One simple (yet not beautiful) solution is the following.
If you're looking for the smallest non-null value, you can use IFNULL with the second parameter being the 'INT limit':
ORDER BY LEAST(
IFNULL(properties.sale_value, 2147483647),
IFNULL(properties.rental_value, 2147483647),
IFNULL(properties.daily_rental_value, 2147483647)
) ASC
And if you're looking for the biggest non-null value, you can use IFNULL with the second parameter being 1 (or the first negative value below your limit; if you don't know it, use the negative INT limit):
ORDER BY GREATEST(
IFNULL(properties.sale_value, 1),
IFNULL(properties.rental_value, 1),
IFNULL(properties.daily_rental_value, 1)
) ASC