How can I use arrayExists function when the array contains a null value? - sql

I have a nullable array column in my table: Array(Nullable(UInt16)). I want to be able to query this column using arrayExists (or arrayAll) to check if it contains a value above a certain threshold but I'm getting an exception when the array contains a null value:
Exception: Expression for function arrayExists must return UInt8, found Nullable(UInt8)
My query is below where distance is the array column:
SELECT * from TracabEvents_ArrayTest
where arrayExists(x -> x > 9, distance);
I've tried updating the comparison in the lambda to "(isNotNull(x) and x > 9)" but I'm still getting the error. Is there any way of handling nulls in these expressions or are they not supported yet?

Add a condition to filter rows with empty list using notEmpty and assumeNotNull for x in arrayExists.
SELECT * FROM TracabEvents_ArrayTest WHERE notEmpty(distance) AND arrayExists(x -> assumeNotNull(x) > 9, distance)

Related

How to fetch all rows where an array contains any of the fields array elements

I have a table that has a column video_ids, it is of a bigint[] type. I would like to find all the rows that have any of the elements from the array passed in a select statement. So, if I have a row that has a video_ids field that looks like this:
{9529387, 9548200, 9579636}
I would like to fetch it if I pass an array that has any of this video_ids. I thought I would do that with any, but I am not sure how to do this in SQL, I have tried with this:
select id, finished, failed, video_ids, invoiced_video_ids, failed_video_ids
from video_order_execution
where order_ids = any(
'{9548200, 11934626, 9579636, 11936321, 11509698, 11552728, 11592106, 11643565, 11707543, 11810386, 11846268}'
::bigint[]);
I get an error if I do that:
ERROR: operator does not exist: bigint[] = bigint Hint: No operator
matches the given name and argument types. You might need to add
explicit type casts.
How can I make such a statement that would do the job for what I need?
Use the operator && which returns true if the 2 operands have any common items:
select id, finished, failed, video_ids, invoiced_video_ids, failed_video_ids
from video_order_execution
where order_ids &&
'{9548200, 11934626, 9579636, 11936321, 11509698, 11552728, 11592106, 11643565, 11707543, 11810386, 11846268}'::bigint[];

How to set outgoing nodes in Cypher?

I have a simple query: given a node A, count the number of nodes that node A has outgoing links, then set that number as a property of A. However I just can't do it.
Attempt 1
MATCH (n)-->(m)
SET n.out=count(m)
RETURN n.name,n.out
This yields the error:
Invalid use of aggregating function count(...) in this context (line 2, column 11 (offset: 28))
"set n.out=count(m)"
^
Attempt 2
MATCH (n)-->(m)
WITH count(m) AS o
SET n.out=o
RETURN n.name,n.out
This yields the error:
Variable `n` not defined (line 3, column 5 (offset: 43))
"SET n.out=o"
^
In both times the errors are in the SET clause. But reading the documentation for SET I cannot identify why these happen.
I cannot count the links because for one pair of n, m there maybe several link types.
You can achieve the desired outcome with the following query:
MATCH (n)
WITH size((n)-->()) as out, n
SET n.out = out
RETURN n.name, n.out

Order array with rows

I try to sort array of rows using the array_sort function, but not result as I wanted.
the first result order correctly
[{type=CALM, confidence=95.1536636352539}, {type=CONFUSED, confidence=1.1397864818572998}, {type=HAPPY, confidence=0.07988717406988144}, {type=SAD, confidence=1.7613277435302734}, {type=SURPRISED, confidence=0.3601384460926056}]
the second line return another result
[{type=CALM, confidence=0.5053133368492126}, {type=CONFUSED, confidence=0.4852835536003113}, {type=HAPPY, confidence=92.1430892944336}, {type=SAD, confidence=1.6924850940704346}, {type=SURPRISED, confidence=3.10842227935791}]
The result expected to second line is
[
{type=HAPPY, confidence=92.1430892944336},
{type=SURPRISED, confidence=3.10842227935791},
{type=SAD, confidence=1.6924850940704346},
{type=CALM, confidence=0.5053133368492126},
{type=CONFUSED, confidence=0.4852835536003113}]
It's possible sort by confidence?
In newer Presto version, you can use array_sort function that takes a lambda function (like sorting in Java with a comparator):
SELECT array_sort(array_or_rows,
(a, b) -> IF(a[2] < b[2], 1, IF(a[2] = b[2], 0, -1))
...
However, Athena is based in Presto 0.172, so this array_sort variant is not available and you need to do something like this instead:
swap sides of your ROW type so that confidence is first
sort array
swap ROWs back
E.g.:
SELECT
transform(
array_sort(
transform(
array_or_rows,
r -> CAST(r AS ROW(confidence double, type varchar)))),
r -> CAST(r AS ROW(type varchar, confidence double)))
....

Passing Optional List argument from Django to filter with in Raw SQL

When using primitive types such as Integer, I can without any problems do a query like this:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(pk)s ISNULL OR id %(pk)s''', params={'pk': 1})
Which would either return row with id = 1 or it would return all rows if pk parameter was equal to None.
However, when trying to use similar approach to pass a list/tuple of IDs, I always produce a SQL syntax error when passing empty/None tuple, e.g. trying:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(ids)s ISNULL OR id IN %(ids)s''', params={'ids': (1,2,3)})
works, but passing () produces SQL syntax error:
psycopg2.ProgrammingError: syntax error at or near ")"
LINE 1: SELECT count(*) FROM account WHERE () ISNULL OR id IN ()
Or if I pass None I get:
django.db.utils.ProgrammingError: syntax error at or near "NULL"
LINE 1: ...LECT count(*) FROM account WHERE NULL ISNULL OR id IN NULL
I tried putting the argument in SQL in () - (%(ids)s) - but that always breaks one or the other condition. I also tried playing around with pg_typeof or casting the argument, but with no results.
Notes:
the actual SQL is much more complex, this one here is a simplification for illustrative purposes
as a last resort - I could alter the SQL in Python based on the argument, but I really wanted to avoid that.)
At first I had an idea of using just 1 argument, but replacing it with a dummy value [-1] and then using it like
cursor.execute(sql='''SELECT ... WHERE -1 = any(%(ids)s) OR id = ANY(%(ids)s)''', params={'ids': ids if ids else [-1]})
but this did a Full table scan for non empty lists, which was unfortunate, so a no go.
Then I thought I could do a little preprocessing in python and send 2 arguments instead of just the single list- the actual list and an empty list boolean indicator. That is
cursor.execute(sql='''SELECT ... WHERE %(empty_ids)s = TRUE OR id = ANY(%(ids)s)''', params={'empty_ids': not ids, 'ids': ids})
Not the most elegant solution, but it performs quite well (Index scan for non empty list, Full table scan for empty list - but that returns the whole table anyway, so it's ok)
And finally I came up with the simplest solution and quite elegant:
cursor.execute(sql='''SELECT ... WHERE '{}' = %(ids)s OR id = ANY(%(ids)s)''', params={'ids': ids})
This one also performs Index scan for non empty lists, so it's quite fast.
From the psycopg2 docs:
Note You can use a Python list as the argument of the IN operator using the PostgreSQL ANY operator.
ids = [10, 20, 30]
cur.execute("SELECT * FROM data WHERE id = ANY(%s);", (ids,))
Furthermore ANY can also work with empty lists, whereas IN () is a SQL syntax error.

Rails 4 querying against postgresql column with array data type error

I am trying to query a table with a column with the postgresql array data type in Rails 4.
Here is the table schema:
create_table "db_of_exercises", force: true do |t|
t.text "preparation"
t.text "execution"
t.string "category"
t.datetime "created_at"
t.datetime "updated_at"
t.string "name"
t.string "body_part", default: [], array: true
t.hstore "muscle_groups"
t.string "equipment_type", default: [], array: true
end
The following query works:
SELECT * FROM db_of_exercises WHERE ('Arms') = ANY (body_part);
However, this query does not:
SELECT * FROM db_of_exercises WHERE ('Arms', 'Chest') = ANY (body_part);
It throws this error:
ERROR: operator does not exist: record = character varying
This does not work for me either:
SELECT * FROM "db_of_exercises" WHERE "body_part" IN ('Arms', 'Chest');
That throws this error:
ERROR: array value must start with "{" or dimension information
So, how do I query a column with an array data type in ActiveRecord??
What I have right now is:
#exercises = DbOfExercise.where(body_part: params[:body_parts])
I want to be able to query records that have more than one body_part associated with them, which was the whole point of using an array data type, so if someone could enlighten me on how to do this that would be awesome. I don't see it anywhere in the docs.
Final solution for posterity:
Using the overlap operator (&&):
SELECT * FROM db_of_exercises WHERE ARRAY['Arms', 'Chest'] && body_part;
I was getting this error:
ERROR: operator does not exist: text[] && character varying[]
so I typecasted ARRAY['Arms', 'Chest'] to varchar:
SELECT * FROM db_of_exercises WHERE ARRAY['Arms', 'Chest']::varchar[] && body_part;
and that worked.
I don't think that it has related to rails.
What if you do the follow?
SELECT * FROM db_of_exercises WHERE 'Arms' = ANY (body_part) OR 'Chest' = ANY (body_part)
I know that rails 4 supports Postgresql ARRAY datatype, but I'm not sure if ActiveRecord creates new methods for query the datatype. Maybe you can use Array Overlap I mean the && operator and then doind something like:
WHERE ARRAY['Arms', 'Chest'] && body_part
or maybe give a look to this gem: https://github.com/dockyard/postgres_ext/blob/master/docs/querying.md
And then do a query like:
DBOfExercise.where.overlap(:body_part => params[:body_parts])
#Aguardientico is absolutely right that what you want is the array overlaps operator &&. I'm following up with some more explanation, but would prefer you to accept that answer, not this one.
Anonymous rows (records)
The construct ('item1', 'item2', ...) is a row-constructor unless it appears in an IN (...) list. It creates an anonymous row, which PostgreSQL calls a "record". The error:
ERROR: operator does not exist: record = character varying
is because ('Arms', 'Chest') is being interpreted as if it were ROW('Arms', 'Chest'), which produces a single record value:
craig=> SELECT ('Arms', 'Chest'), ROW('Arms', 'Chest'), pg_typeof(('Arms', 'Chest'));
row | row | pg_typeof
--------------+--------------+-----------
(Arms,Chest) | (Arms,Chest) | record
(1 row)
and PostgreSQL has no idea how it's supposed to compare that to a string.
I don't really like this behaviour; I'd prefer it if PostgreSQL required you to explicitly use a ROW() constructor when you want an anonymous row. I expect that the behaviour shown here exists to support SET (col1,col2,col3) = (val1,val2,val3) and other similar operations where a ROW(...) constructor wouldn't make as much sense.
But the same thing with a single item works?
The reason the single ('Arms') case works is because unless there's a comma it's just a single parenthesised value where the parentheses are redundant and may be ignored:
craig=> SELECT ('Arms'), ROW('Arms'), pg_typeof(('Arms')), pg_typeof(ROW('Arms'));
?column? | row | pg_typeof | pg_typeof
----------+--------+-----------+-----------
Arms | (Arms) | unknown | record
(1 row)
Don't be alarmed by the type unknown. It just means that it's a string literal that hasn't yet had a type applied:
craig=> SELECT pg_typeof('blah');
pg_typeof
-----------
unknown
(1 row)
Compare array to scalar
This:
SELECT * FROM "db_of_exercises" WHERE "body_part" IN ('Arms', 'Chest');
fails with:
ERROR: array value must start with "{" or dimension information
because of implicit casting. The type of the body_part column is text[] (or varchar[]; same thing in PostgreSQL). You're comparing it for equality with the values in the IN clause, which are unknown-typed literals. The only valid equality operator for an array is = another array of the same type, so PostgreSQL figures that the values in the IN clause must also be arrays of text[] and tries to parse them as arrays.
Since they aren't written as array literals, like {"FirstValue","SecondValue"}, this parsing fails. Observe:
craig=> SELECT 'Arms'::text[];
ERROR: array value must start with "{" or dimension information
LINE 1: SELECT 'Arms'::text[];
^
See?
It's easier to understand this once you see that IN is actually just a shorthand for = ANY. It's an equality comparison to each element in the IN list. That isn't what you want if you really want to find out if two arrays overlap.
So that's why you want to use the array overlaps operator &&.