The parentheses rules of PostgreSQL, is there a summarized guide? - sql

In mathematics and many programming languages (and, I think, standard SQL as well), parentheses are used to change precedence (grouping the parts to be evaluated first) or to enhance readability (for human eyes).
Equivalent Examples:
SELECT array[1,2] #> array[1]
SELECT (array[1,2]) #> array[1]
SELECT array[1,2] #> (array[1])
SELECT ((array[1,2]) #> (array[1]))
But SELECT 1 = ANY array[1,2] is a syntax error (!), and SELECT 1 = ANY (array[1,2]) is valid. Why?
OK, because "the manual says so". But what is the logic that lets humans remember all the exceptions?
Is there a guide about it?
I do not understand why (expression) is the same as expression in some cases, but not in other cases.
PS1: parentheses are also used as value-list delimiters, as in expression IN (value [, ...]). But an array is not a value-list, and there does not seem to be a general rule in PostgreSQL when (array expression) is not the same as array expression.
Also, I used array as example, but this problem/question is not only about arrays.

"Is there a summarized guide?", well... The answer is no, so: hands-on! This answer is a Wiki, let's write.
Summarized guide
Let
F() be an ordinary function (e.g. ROUND)
L() be a function-like operator (e.g. ANY)
f be an operator-like function (e.g. current_date)
Op be an operator
Op1, Op2 be distinct operators
A, B, C be values or expressions
S be an expression list, such as "(A,B,C)"
The rules, using these elements, are in the form
rule: notes.
"pure" mathematical expressions
When Op, Op1, Op2 are mathematical operators (e.g. +, -, *), and F() is a mathematical function (e.g. ROUND()).
Rules for scalar expressions and "pure array expressions":
A Op B = (A Op B): the parentheses are optional.
A Op1 B Op2 C: need to check precedence.
(A Op1 B) Op2 C: enforce "first (A Op1 B)".
A Op1 (B Op2 C): enforce "first (B Op2 C)".
F(A) = (F(A)) = F((A)) = (F((A))): the parentheses are optional.
S = (S): the external parentheses are optional.
f=(f): the parentheses are optional.
Expressions with function-like operators
Rules for operators such as ALL, ANY, ROW, SOME, etc.
L(A) = L((A)): the extra parentheses around the argument are optional.
(L(A)): SYNTAX ERROR.
...More rules? Please help editing here.

ANY is a function-like construct. Like (almost) any other function in Postgres, it requires parentheses around its parameters. This makes the syntax consistent and helps the parser avoid ambiguities.
You can think of ANY() like a shorthand for unnest() condensed to a single expression.
One might argue for an additional set of parentheses around the set variant of ANY. But that would be ambiguous, since a list of values in parentheses is interpreted as a single ROW type.

Related

Sending ARRAY to VALUES clause fails

If I want to construct a temporary valueset for testing, I can do something like this:
SELECT * FROM (VALUES (97.99), (98.01), (99.00))
which will result in this:
# | COLUMN1
1 | 97.99
2 | 98.01
3 | 99.00
However, if I want to construct a result set where one of the columns contains an ARRAY, like this:
SELECT * FROM (VALUES (97.99, [14, 37]), (98.01, []), (99.00, [14]))
I would expect this:
# | COLUMN1 | COLUMN2
1 | 97.99 | [14, 37]
2 | 98.01 | []
3 | 99.00 | [14]
but I actually get the following error:
Invalid expression [ARRAY_CONSTRUCT(14, 37)] in VALUES clause
I don't see anything in the documentation for the VALUES clause that explains why this is invalid. What am I doing wrong here and how can I generate a result set with an ARRAY column?
I think the values clause only allows primitive types. You can define it as a string in single quotes and use parse_json to turn it into an array:
SELECT $1 COL1, parse_json($2)::array COL2
FROM (VALUES (97.99, '[14, 37]'), (98.01, '[]'), (99.00, '[14]'));
VALUES() has some restrictions:
Each expression must be a constant, or an expression that can be evaluated as a constant during compilation of the SQL statement.
Most simple arithmetic expressions and string functions can be evaluated at compile time, but most other expressions cannot.
https://docs.snowflake.com/en/sql-reference/constructs/values.html
From the documentation
Each expression must be a constant, or an expression that can be
evaluated as a constant during compilation of the SQL statement.
Most simple arithmetic expressions and string functions can be
evaluated at compile time, but most other expressions cannot.
The documentation doesn't explicitly say this, but given that arrays can hold multiple data types and a varying number of elements, I suspect that arrays in most SQL-based databases are dynamic and are not evaluated at compile time. Maybe some experts can shed more light on this.
Back to your problem, I would just use explicit select statements like:
select 97.99, [14, 37] union all
select 98.01, [];

PostgreSQL =ANY and IN [duplicate]

Recently I've read Quantified Comparison Predicates – Some of SQL’s Rarest Species:
In fact, the SQL standard defines the IN predicate as being just syntax sugar for the = ANY() quantified comparison predicate.
8.4 <in predicate>
Let RVC be the <row value predicand> and
let IPV be the <in predicate value>.
The expression RVC IN IPV
is equivalent to RVC = ANY IPV
Fair enough, based on other answers like: What is exactly “SOME / ANY” and “IN” or Oracle: '= ANY()' vs. 'IN ()'
I've assumed that I could use them interchangeably.
Now here is my example:
select 'match'
where 1 = any( string_to_array('1,2,3', ',')::int[])
-- match
select 'match'
where 1 IN ( string_to_array('1,2,3', ',')::int[])
-- ERROR: operator does not exist: integer = integer[]
-- HINT: No operator matches the given name and argument type(s).
-- You might need to add explicit type casts.
The question is: why does the first query work while the second returns an error?
That's because IN (unlike ANY) does not accept an array as input. Only a set (from a subquery) or a list of values. Detailed explanation:
How to use ANY instead of IN in a WHERE clause with Rails?

About sql and logic. In the sql where clause, is "not (p and q)" equal to "(not p) or (not q)"

A SQL and logic problem. In the where clause, is
not (p and q)
equal to
(not p) or (not q)
Yes. De Morgan's laws are language-independent.
Refer to the working fiddle:
Query 1: not (p and q)
select * from table1
where
not (p = 1 and q = 1);
Query 2 : (not p) or (not q)
select * from table1
where p!=1 or q!=1;
There is no difference in the output, so the Boolean identity not (p and q) = (not p) or (not q) holds.
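The fiddle only covers rows where p and q have values. Here is a rough sketch that also checks NULL, assuming SQLite through Python's built-in sqlite3 module (my choice for a self-contained test, not the database from the question) and using 0/1 as stand-ins for false/true:

```python
import sqlite3
from itertools import product

# In-memory SQLite database: an assumption made so the check is self-contained.
conn = sqlite3.connect(":memory:")


def both_sides(p, q):
    """Evaluate NOT (p AND q) and (NOT p) OR (NOT q) for one pair of inputs."""
    return conn.execute(
        "SELECT NOT (? AND ?), (NOT ?) OR (NOT ?)",
        (p, q, p, q),
    ).fetchone()


# 0 = false, 1 = true, None = SQL NULL (unknown).
truth_values = [0, 1, None]
results = {(p, q): both_sides(p, q) for p, q in product(truth_values, repeat=2)}

for (p, q), (lhs, rhs) in results.items():
    print(f"p={p!s:>4} q={q!s:>4}  not(p and q)={lhs!s:>4}  (not p) or (not q)={rhs!s:>4}")

# True only if the two forms agree for every combination, NULL included.
all_equal = all(lhs == rhs for lhs, rhs in results.values())
print("forms agree everywhere:", all_equal)
```

Under these assumptions the two forms agree for all nine combinations, including the ones where the result is unknown (NULL).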
This is a rather late answer, but what you are describing is De Morgan's law. Your expression not (p and q) is converted to
not p or not q
because the negation (not) is applied to each part of the statement (p and q):
p becomes not p
and becomes or
q becomes not q
Although the two expressions are logically equivalent, they may not be functionally equivalent. It depends on the nature of p and q and the operation of the language's optimiser.
Consider, for instance, that p is false. In the case of (not p) or (not q), we can deduce that the expression is true without having to evaluate q. A clever optimiser that understands or might take a short cut like that. But we cannot do so in the case of not (p and q) (unless our theorised optimiser could itself apply de Morgan first).
Does anyone know if SQL Server, Oracle, or the other major players do this type of optimisation?
The result may not just be a performance saving. Suppose the q is not just a simple boolean variable but some expression that includes the execution of some more complex function. If that function has side-effects other than returning a truth value, then by optimising-out the evaluation of q we would also not see those side effects.

Postgresql using the VARIADIC on a query inside a function

I would like to create a PostgreSQL function that receives an array of bigint (record ids) and uses it in a query with the "in" condition.
I know that I could simply write the query myself, but the point here is that the function will also do some other validations and processing.
The source that I was trying to use was something like this:
CREATE OR REPLACE FUNCTION func_test(VARIADIC arr bigint[])
RETURNS TABLE(record_id bigint,parent_id bigint)
AS $$ SELECT s.record_id, s.parent_id FROM TABLE s WHERE s.column in ($1);
$$ LANGUAGE SQL;
Using the above code I receive the following error:
ERROR: operator does not exist: bigint = bigint[]
LINE 3: ...ECT s.record_id, s.parent_id FROM TABLE s WHERE s.column in ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
How can I fix this?
IN doesn't work with arrays the way you think it does. IN wants to see either a list of values
expression IN (value [, ...])
or a subquery:
expression IN (subquery)
A single array satisfies the first form syntactically, but that form of IN compares expression against each value using the equality operator (=) and, as the error message tells you, there is no equality operator that can compare a bigint with a bigint[].
You're looking for ANY:
9.23.3. ANY/SOME (array)
expression operator ANY (array expression)
expression operator SOME (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ANY is "true" if any true result is obtained. The result is "false" if no true result is found (including the case where the array has zero elements).
So you want to say:
WHERE s.column = any ($1)
Also, you're not using the argument's name so you don't need to give it one, just this:
CREATE OR REPLACE FUNCTION func_test(VARIADIC bigint[]) ...
will be sufficient. You can leave the name there if you want, it won't hurt anything.

What applications are there for NULLIF()?

I just had a trivial but genuine use for NULLIF(), for the first time in my career in SQL. Is it a widely used tool I've just ignored, or a nearly-forgotten quirk of SQL? It's present in all major database implementations.
If anyone needs a refresher, NULLIF(A, B) returns the first value, unless it's equal to the second, in which case it returns NULL. It is equivalent to this CASE expression:
CASE WHEN A <> B OR B IS NULL THEN A END
or, in C-style syntax:
A == B || A == null ? null : A
So far the only non-trivial example I've found is to exclude a specific value from an aggregate function:
SELECT COUNT(NULLIF(Comment, 'Downvoted'))
This has the limitation of only allowing one to skip a single value; a CASE, while more verbose, would let you use an expression.
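To make that concrete, here is a small sketch, again assuming SQLite via Python's sqlite3 module. The votes table and its rows are made up for illustration, as is the second excluded value 'Thanks' in the CASE variant:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and sample rows, invented for this illustration.
conn.execute("CREATE TABLE votes (Comment TEXT)")
conn.executemany(
    "INSERT INTO votes VALUES (?)",
    [("Downvoted",), ("Nice answer",), ("Downvoted",), (None,), ("Thanks",)],
)

# COUNT skips NULLs, so turning 'Downvoted' into NULL removes it from the count.
total_rows, counted = conn.execute(
    "SELECT COUNT(*), COUNT(NULLIF(Comment, 'Downvoted')) FROM votes"
).fetchone()

# The more verbose CASE form can exclude several values (or any expression).
case_counted = conn.execute(
    "SELECT COUNT(CASE WHEN Comment NOT IN ('Downvoted', 'Thanks') "
    "THEN Comment END) FROM votes"
).fetchone()[0]

print(total_rows, counted, case_counted)  # 5 2 1
```

The row with a NULL comment is skipped in both counts, because COUNT(expression) only counts non-null values.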
For the record, the use I found was to suppress the value of a "most recent change" column if it was equal to the first change:
SELECT Record, FirstChange, NULLIF(LatestChange, FirstChange) AS LatestChange
This was useful only in that it reduced visual clutter for human consumers.
I rather think that
NULLIF(A, B)
is syntactic sugar for
CASE WHEN A = B THEN NULL ELSE A END
But you are correct: it is mere syntactic sugar to aid the human reader.
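Both CASE spellings in this thread can be compared against NULLIF directly. This is only a sketch, assuming SQLite through Python's sqlite3 module; the sample values 1, 2 and NULL are arbitrary:

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")

# Column 1: NULLIF itself.
# Column 2: the "A = B" CASE form.
# Column 3: the "A <> B OR B IS NULL" CASE form from the question.
QUERY = """
SELECT NULLIF(a, b),
       CASE WHEN a = b THEN NULL ELSE a END,
       CASE WHEN a <> b OR b IS NULL THEN a END
FROM (SELECT ? AS a, ? AS b)
"""


def three_forms(a, b):
    """Return the three results for one (a, b) pair as a tuple."""
    return conn.execute(QUERY, (a, b)).fetchone()


samples = [1, 2, None]  # None is bound as SQL NULL
table = {(a, b): three_forms(a, b) for a, b in product(samples, repeat=2)}

for (a, b), row in table.items():
    print(f"A={a!s:>4} B={b!s:>4} -> {row}")

# True only if all three columns match for every pair.
forms_agree = all(len(set(row)) == 1 for row in table.values())
print("all three forms agree:", forms_agree)
```

With these inputs the three columns match for every pair, including the cases where A, B, or both are NULL.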
I often use it where I need to avoid the Division by Zero exception:
SELECT
COALESCE(Expression1 / NULLIF(Expression2, 0), 0) AS Result
FROM …
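A sketch of that pattern with concrete numbers, assuming SQLite via Python's sqlite3 module and a made-up ratios table. One caveat: SQLite returns NULL on division by zero instead of raising, so this only shows the values the expression produces, not the exception it avoids on other engines:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, chosen only for this illustration.
conn.execute("CREATE TABLE ratios (numerator INTEGER, denominator INTEGER)")
conn.executemany(
    "INSERT INTO ratios VALUES (?, ?)",
    [(10, 2), (7, 0), (None, 3)],
)

# NULLIF turns a zero denominator into NULL, the division then yields NULL,
# and COALESCE maps that NULL to 0.
safe_results = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(numerator / NULLIF(denominator, 0), 0) "
        "FROM ratios ORDER BY rowid"
    )
]
print(safe_results)  # [5, 0, 0]
```

Note that a NULL numerator also ends up as 0 here, which may or may not be what you want.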
Three years later, I found a material use for NULLIF: using NULLIF(Field, '') translates empty strings into NULL, for equivalence with Oracle's peculiar idea about what "NULL" represents.
NULLIF is handy when you're working with legacy data that contains a mixture of null values and empty strings.
Example:
SELECT COALESCE(NULLIF(firstColumn, ''), secondColumn) FROM table WHERE this = that
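Here is how that behaves on a few rows. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the legacy table and its contents are invented, and only the column names come from the example above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical legacy table mixing real values, empty strings and NULLs.
conn.execute("CREATE TABLE legacy (firstColumn TEXT, secondColumn TEXT)")
conn.executemany(
    "INSERT INTO legacy VALUES (?, ?)",
    [("Ada", "fallback-1"), ("", "fallback-2"), (None, "fallback-3")],
)

# NULLIF maps '' to NULL, so COALESCE treats empty strings and NULLs alike.
chosen = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(NULLIF(firstColumn, ''), secondColumn) "
        "FROM legacy ORDER BY rowid"
    )
]
print(chosen)  # ['Ada', 'fallback-2', 'fallback-3']
```

Without the NULLIF, the second row would return the empty string instead of its fallback.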
SUM and COUNT skip nulls, and COUNT reports zero when nothing is left to count. I could see NULLIF being handy when you want to undo that behavior. In fact, this came up in a recent answer I provided. If I had remembered NULLIF I probably would have written the following
SELECT student,
       NULLIF(coursecount, 0) AS courseCount
FROM (SELECT cs.student,
             COUNT(os.course) coursecount
      FROM #CURRENTSCHOOL cs
           LEFT JOIN #OTHERSCHOOLS os
             ON cs.student = os.student
            AND cs.school <> os.school
      GROUP BY cs.student) t
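For anyone who wants to try it, here is a rough equivalent assuming SQLite via Python's sqlite3 module. The #temp tables are replaced by ordinary tables, and the student data is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ordinary tables stand in for the #CURRENTSCHOOL / #OTHERSCHOOLS temp tables;
# names and rows are hypothetical.
conn.execute("CREATE TABLE currentschool (student TEXT, school TEXT)")
conn.execute("CREATE TABLE otherschools (student TEXT, school TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO currentschool VALUES (?, ?)",
    [("ann", "north"), ("bob", "north")],
)
conn.executemany(
    "INSERT INTO otherschools VALUES (?, ?, ?)",
    [("ann", "south", "math"), ("ann", "south", "art"), ("bob", "north", "math")],
)

# COUNT gives 0 for students without courses at another school;
# NULLIF turns that 0 back into NULL.
course_counts = conn.execute(
    """
    SELECT student, NULLIF(coursecount, 0) AS courseCount
    FROM (SELECT cs.student, COUNT(os.course) AS coursecount
          FROM currentschool cs
               LEFT JOIN otherschools os
                 ON cs.student = os.student
                AND cs.school <> os.school
          GROUP BY cs.student) t
    ORDER BY student
    """
).fetchall()
print(course_counts)  # [('ann', 2), ('bob', None)]
```

bob's only course is at his current school, so the join finds nothing for him and his count comes back as NULL rather than 0.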