How to query a json column for empty objects?

Looking to find all rows where a certain json column contains an empty object, {}. This is possible with JSON arrays, or if I am looking for a specific key in the object. But I just want to know if the object is empty. Can't seem to find an operator that will do this.
dev=# \d test
Table "public.test"
Column | Type | Modifiers
--------+------+-----------
foo | json |
dev=# select * from test;
foo
---------
{"a":1}
{"b":1}
{}
(3 rows)
dev=# select * from test where foo != '{}';
ERROR: operator does not exist: json <> unknown
LINE 1: select * from test where foo != '{}';
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
dev=# select * from test where foo != to_json('{}'::text);
ERROR: operator does not exist: json <> json
LINE 1: select * from test where foo != to_json('{}'::text);
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
dev=# select * from test where foo != '{}'::json;
ERROR: operator does not exist: json <> json
LINE 1: select * from test where foo != '{}'::json;
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

There is no equality (or inequality) operator for the data type json as a whole, because equality is hard to establish. Consider jsonb in Postgres 9.4 or later, where this is possible. More details in this related answer on dba.SE (last chapter):
How to remove known elements from a JSON[] array in PostgreSQL?
SELECT DISTINCT json_column ... or ... GROUP BY json_column fail for the same reason (no equality operator).
Casting both sides of the expression to text allows = or <> operators, but that's not normally reliable as there are many possible text representations for the same JSON value. In Postgres 9.4 or later, cast to jsonb instead. (Or use jsonb to begin with.)
However, for this particular case (an empty object stored exactly as {}) it works just fine:
select * from test where foo::text <> '{}'::text;
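To illustrate why comparing text is fragile in general, here is a small Python sketch. It models the idea only and is not Postgres itself; the sample strings are made up. The json type keeps the input text as written, so two equal values can differ as text, and an empty object written with inner whitespace would also slip past a comparison against '{}'.

```python
import json

# Two spellings of the same JSON object (hypothetical sample data).
a = '{"a": 1, "b": 2}'
b = '{"b":2,"a":1}'

text_equal = a == b                            # compares the raw text
value_equal = json.loads(a) == json.loads(b)   # compares the parsed values

# An empty object with whitespace inside is still an empty object.
spaced_empty = '{ }'
matches_text = spaced_empty == '{}'            # text comparison misses it
is_empty = json.loads(spaced_empty) == {}      # value comparison finds it

print(text_equal, value_equal, matches_text, is_empty)
```

Comparing parsed values is roughly what casting to jsonb gives you on the database side.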

Empty JSON array [] could also be relevant.
Then this could work for both [] and {}:
select * from test where length(foo::text) > 2 ;

You have to be careful. Casting all your data to a different type so you can compare it will cause performance problems on a large database.
If your data has a consistent key, you can test for the existence of that key instead. For example, if plan is either {} or {"id": "1"},
then you can look for rows without 'id' (the ? operator requires a jsonb column):
SELECT * FROM public."user"
where NOT(plan ? 'id')

As of PostgreSQL 9.5 there is no built-in operator to test whether a json value is empty. I agree it would be very useful, so I created a request for it:
https://postgresql.uservoice.com/forums/21853-general/suggestions/12305481-check-if-json-is-empty
Feel free to vote for it; hopefully it will be implemented!

In 9.3 it is possible to count the pairs in each object and filter for the ones with none:
create table test (foo json);
insert into test (foo) values
('{"a":1, "c":2}'), ('{"b":1}'), ('{}');
select *
from test
where (select count(*) from json_each(foo) s) = 0;
foo
-----
{}
Or test for existence, which is probably faster for big objects:
select *
from test
where not exists (select 1 from json_each(foo) s);
Both techniques work regardless of formatting.
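The point about formatting can be sketched in Python. This is a model of the approach rather than Postgres code: has_no_pairs is a hypothetical helper that mimics the NOT EXISTS over json_each(), and the whitespace variants in the sample list are assumptions added for illustration.

```python
import json

def has_no_pairs(json_text):
    """Mimic NOT EXISTS (SELECT 1 FROM json_each(foo)) for a JSON object."""
    obj = json.loads(json_text)
    if not isinstance(obj, dict):
        # Assumption: non-objects are treated as an error in this sketch.
        raise TypeError("expected a JSON object")
    # Stop at the first key/value pair instead of counting all of them.
    return next(iter(obj.items()), None) is None

samples = ['{"a":1, "c":2}', '{"b":1}', '{}', '{ }', '{\n}']
empty = [s for s in samples if has_no_pairs(s)]
print(empty)
```

Because the check works on parsed pairs rather than on text, all three spellings of the empty object are found.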

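According to the JSON Functions and Operators documentation, you can use the ->> operator to get a JSON object field or array element as text, then do an equality check against a string. Note that this tests whether the value stored under 'key' is an empty object, not whether the column itself is empty.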
So this worked for me:
SELECT jsonb_col from my_table
WHERE jsonb_col ->> 'key' = '{}';
Or, if it's nested more than one level, use the path operator (#>>):
SELECT jsonb_col from my_table
WHERE jsonb_col #>> '{key, nestedKey}' = '{}';
Supported versions as of this writing: Current (13) / 12 / 11 / 10 / 9.6

Related

How to cast postgres JSON column to int without key being present in JSON (simple JSON values)?

I am working on data in postgresql as in the following mytable with the fields id (type int) and val (type json):
id | val
----+--------
1 | "null"
2 | "0"
3 | "2"
The values in the json column val are simple JSON values, i.e. just strings with surrounding quotes and have no key.
I have looked at the SO post How to convert postgres json to integer and attempted something like the solution presented there
SELECT (mytable.val->>'key')::int FROM mytable;
but in my case, I do not have a key to address the field and leaving it empty does not work:
SELECT (mytable.val->>'')::int as val_int FROM mytable;
This returns NULL for all rows.
The best I have come up with is the following (casting to varchar first, trimming the quotes, filtering out the string "null" and then casting to int):
SELECT id, nullif(trim('"' from mytable.val::varchar), 'null')::int as val_int FROM mytable;
which works, but surely cannot be the best way to do it, right?
Here is a db<>fiddle with the example table and the statements above.
Found the way to do it:
You can access the content via the keypath (see e.g. this PostgreSQL JSON cheatsheet):
Using the #> and #>> operators, you can access JSON fields through a key path. Specifying an empty key path like this {} allows you to get your content without a key.
Using double angle brackets >> in the accessor will return the content without the quotes, so there is no need for the trim() function.
Overall, the statement
select id
, nullif(val#>>'{}', 'null')::int as val_int
from mytable
;
will return the contents of the former json column as int, or NULL respectively (in PostgreSQL >= 9.4):
id | val_int
----+---------
1 | NULL
2 | 0
3 | 2
See updated db<>fiddle here.
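The chain of steps in that statement (extract the top-level value as text, turn the string 'null' into NULL, cast to int) can be traced with a small Python model. The helper names extract_text, nullif and to_int are hypothetical stand-ins for the SQL pieces, and the rows are the sample data from the question.

```python
import json

rows = [(1, '"null"'), (2, '"0"'), (3, '"2"')]

def extract_text(json_text):
    """Model of val #>> '{}': the top-level value as text, without quotes."""
    value = json.loads(json_text)
    if value is None:            # JSON null becomes SQL NULL
        return None
    if isinstance(value, str):   # JSON string: the quotes are dropped
        return value
    return json.dumps(value)     # numbers, booleans, objects, arrays

def nullif(value, other):
    """Model of nullif(): NULL when both arguments are equal."""
    return None if value == other else value

def to_int(text):
    """Model of ::int, which passes NULL through."""
    return None if text is None else int(text)

result = [(row_id, to_int(nullif(extract_text(val), 'null')))
          for row_id, val in rows]
print(result)
```

The nullif() step is needed because the stored value is the JSON string "null", not a JSON null, so it would otherwise reach the int cast as text.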
--
Note: As pointed out by @Mike in his comment above, if the column format is jsonb, you can also use val->>0 to dereference scalars. However, if the format is json, the ->> operator will yield null as result. See this db<>fiddle.

Postgres Array[VarChar] uppercase?

I'm trying to find a way to force an array to make it upper or lowercase. This is so that no matter what the user inputs they get a result. This is the query:
select * from table where any(:id) = databasecolumn
:id is an array of chars that the user inputs(can be lowercase or uppercase) and I need to make sure that whatever the user inputs they get a result.
This works as long as the user inputs in uppercase (because the database values are also uppercase). But when they input lowercase letters they get no response.
I tried this:
select * from table where any(upper(:id)) = upper(databasecolumn)
but this does not work because the function "upper" is not for arrays. It works fine when I do it with a single input but not arrays.
Do you have any pointers? I couldn't find an equivalent function for an array of varchars.
You could use ILIKE:
select *
from table
where databasecolumn ILIKE any(:id);
This:
with data (col) as (
values ('one'), ('Two'), ('THREE')
)
select *
from data
where col ilike any(array['one', 'two', 'three']);
returns:
col
-----
one
Two
THREE
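In Python terms, the ILIKE ANY filter behaves roughly like the sketch below. It is a simplified model: ilike_any is a hypothetical helper and the extra 'four' row is made up. One difference worth knowing is that ILIKE treats % and _ in the patterns as wildcards, which this model does not reproduce, so user input containing those characters would match more than an exact comparison.

```python
col_values = ['one', 'Two', 'THREE', 'four']
user_input = ['ONE', 'two', 'Three']

def ilike_any(value, patterns):
    """Case-insensitive equality against any pattern (wildcards not modelled)."""
    return any(value.lower() == p.lower() for p in patterns)

matched = [v for v in col_values if ilike_any(v, user_input)]
print(matched)
```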
You can also use double casting, like here:
t=# with a as (select '{caSe1,cAse2}'::text[] r) select r,upper(r::text)::text[] from a where true;
r | upper
---------------+---------------
{caSe1,cAse2} | {CASE1,CASE2}
(1 row)
This forgoes the benefits of using ANY, though.

How to transfer a column in an array using PostgreSQL, when the columns data type is a composite type?

I'm using PostgreSQL 9.4 and I'm currently trying to collect a column's values into an array. For "normal" (not user-defined) data types I get it to work.
To explain my problem in detail, I made up a minimal example.
Let's assume we define a composite type "compo" and create a table "test_rel" and insert some values. Looks like this and works for me:
CREATE TYPE compo AS(a int, b int);
CREATE TABLE test_rel(t1 compo[],t2 int);
INSERT INTO test_rel VALUES('{"(1,2)"}',3);
INSERT INTO test_rel VALUES('{"(4,5)","(6,7)"}',3);
Next, we try to get an array with column t2's values. The following also works:
SELECT array(SELECT t2 FROM test_rel WHERE t2='3');
Now we try to do the same with column t1 (the column with the composite type). My problem is that the following doesn't work:
SELECT array(SELECT t1 FROM test_rel WHERE t2='3');
ERROR: could not find array type for data type compo[]
Could someone please give me a hint why the same statement doesn't work with the composite type? I'm new not only to Stack Overflow but also to PostgreSQL and plpgsql, so please tell me if I'm doing something the wrong way.
There was some discussion about this on the PostgreSQL mailing list.
Long story short, both
select array(select array_type from ...)
select array_agg(array_type) from ...
represent the concept of an array of arrays, which PostgreSQL doesn't support. PostgreSQL supports multidimensional arrays, but they have to be rectangular. For example, ARRAY[[0,1],[2,3]] is valid, but ARRAY[[0],[1,2]] is not.
There were some improvements to both the array constructor and the array_agg() function in 9.5.
The documentation now explicitly states that they accumulate array arguments into a multidimensional array, but only if all of the parts have equal dimensions:
array() constructor: If the subquery's output column is of an array type, the result will be an array of the same type but one higher dimension; in this case all the subquery rows must yield arrays of identical dimensionality, else the result would not be rectangular.
array_agg(any array type): input arrays concatenated into array of one higher dimension (inputs must all have same dimensionality, and cannot be empty or NULL)
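The rectangularity requirement can be checked with a short Python sketch. It only models one level of nesting, and is_rectangular is a hypothetical helper, not a Postgres function. Applied to the question's sample data, it shows that the two t1 values (one element and two elements) could not be stacked into a single multidimensional array.

```python
def is_rectangular(rows):
    """True if every inner list has the same length (one level of nesting)."""
    return len({len(r) for r in rows}) <= 1

valid = is_rectangular([[0, 1], [2, 3]])
ragged = is_rectangular([[0], [1, 2]])

# The t1 values from test_rel: one array of length 1, one of length 2.
t1_values = [["(1,2)"], ["(4,5)", "(6,7)"]]
t1_ok = is_rectangular(t1_values)

print(valid, ragged, t1_ok)
```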
For 9.4, you could wrap the array into a row; this way you can create something that is almost an array of arrays:
SELECT array(SELECT ROW(t1) FROM test_rel WHERE t2='3');
SELECT array_agg(ROW(t1)) FROM test_rel WHERE t2='3';
Or you could use a recursive CTE (and array concatenation) to work around the problem, like:
with recursive inp(arr) as (
values (array[0,1]), (array[1,2]), (array[2,3])
),
idx(arr, idx) as (
select arr, row_number() over ()
from inp
),
agg(arr, idx) as (
select array[[0, 0]] || arr, idx
from idx
where idx = 1
union all
select agg.arr || idx.arr, idx.idx
from agg
join idx on idx.idx = agg.idx + 1
)
select arr[array_lower(arr, 1) + 1 : array_upper(arr, 1)]
from agg
order by idx desc
limit 1;
But of course this solution is highly dependent on your data (and its dimensions).

How to use ANY instead of IN in a WHERE clause?

I used to have a query like this in Rails:
MyModel.where(id: ids)
Which generates a SQL query like:
SELECT "my_models".* FROM "my_models"
WHERE "my_models"."id" IN (1, 28, 7, 8, 12)
Now I want to change this to use ANY instead of IN. I created this:
MyModel.where("id = ANY(VALUES(#{ids.join '),('}))")
Now when I use an empty array ids = [] I get the following error:
MyModel Load (53.0ms) SELECT "my_models".* FROM "my_models" WHERE (id = ANY(VALUES()))
ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
Position: 75: SELECT "social_messages".* FROM "social_messages" WHERE (id = ANY(VALUES()))
from arjdbc/jdbc/RubyJdbcConnection.java:838:in `execute_query'
There are two variants of IN expressions:
expression IN (subquery)
expression IN (value [, ...])
Similarly, two variants with the ANY construct:
expression operator ANY (subquery)
expression operator ANY (array expression)
A subquery works for either technique, but for the second form of each, IN expects a list of values (as defined in standard SQL) while = ANY expects an array.
Which to use?
ANY is a later, more versatile addition; it can be combined with any binary operator returning a boolean value. IN boils down to a special case of ANY. In fact, its second form is rewritten internally:
IN is rewritten with = ANY
NOT IN is rewritten with <> ALL
Check the EXPLAIN output for any query to see for yourself. This proves two things:
IN can never be faster than = ANY.
= ANY is not going to be substantially faster.
The choice should be decided by what's easier to provide: a list of values or an array (possibly as array literal - a single value).
If the IDs you are going to pass come from within the DB anyway, it is much more efficient to select them directly (subquery) or integrate the source table into the query with a JOIN (like @mu commented).
To pass a long list of values from your client and get the best performance, use an array, unnest() and join, or provide it as table expression using VALUES (like @PinnyM commented). But note that a JOIN preserves possible duplicates in the provided array / set while IN or = ANY do not. More:
Optimizing a Postgres query with a large IN
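The difference in how duplicates are treated can be seen in a small Python model. The lists are made-up sample data: table_ids stands for the rows of the table and provided for the array passed in by the client.

```python
table_ids = [1, 2, 3]    # ids present in the table (hypothetical)
provided = [1, 1, 2]     # array passed by the client, with a duplicate

# JOIN against the unnested array: one output row per matching pair.
joined = [t for p in provided for t in table_ids if t == p]

# IN / = ANY: each table row is tested once, so it appears at most once.
filtered = [t for t in table_ids if t in provided]

print(joined, filtered)
```

If the provided list may contain duplicates and they are not wanted, deduplicating it before the join gives the same result as = ANY.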
In the presence of NULL values, NOT IN is often the wrong choice and NOT EXISTS would be right (and faster, too):
Select rows which are not present in other table
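The NULL problem with NOT IN comes from SQL's three-valued logic, which the following Python sketch models with None standing in for NULL (unknown). The helper names are hypothetical. A WHERE clause keeps a row only when the condition is true, so an unknown result filters the row out.

```python
def sql_eq(a, b):
    """SQL equality: unknown (None) if either side is NULL."""
    return None if a is None or b is None else a == b

def sql_in(x, values):
    """x IN (values): true on any match, unknown if a NULL could have matched."""
    results = [sql_eq(x, v) for v in values]
    if any(r is True for r in results):
        return True
    if any(r is None for r in results):
        return None
    return False

def sql_not_in(x, values):
    """x NOT IN (values): negation, where NOT unknown stays unknown."""
    result = sql_in(x, values)
    return None if result is None else not result

without_null = sql_not_in(2, [1, 3])
with_null = sql_not_in(2, [1, None])
print(without_null, with_null)
```

With a single NULL in the list, NOT IN can never return true, so the query returns no rows at all.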
Syntax for = ANY
For the array expression Postgres accepts:
an array constructor (array is constructed from a list of values on the Postgres side) of the form: ARRAY[1,2,3]
or an array literal of the form '{1,2,3}'.
To avoid invalid type casts, you can cast explicitly:
ARRAY[1,2,3]::numeric[]
'{1,2,3}'::bigint[]
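One practical consequence for the question: an array literal remains valid when the list is empty, because '{}' is an empty array, whereas VALUES() is a syntax error. A minimal Python sketch of building such a literal for integer ids follows; int_array_literal is a hypothetical helper, and the resulting string is meant to be passed as a single bound parameter and cast on the Postgres side, for example with ::int[].

```python
def int_array_literal(ids):
    """Build a Postgres array literal such as {1,28,7} from integer ids."""
    # int() rejects anything that is not a whole number, so no quoting is needed.
    return "{" + ",".join(str(int(i)) for i in ids) + "}"

non_empty = int_array_literal([1, 28, 7, 8, 12])
empty = int_array_literal([])
print(non_empty, empty)
```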
Related:
PostgreSQL: Issue with passing array to procedure
How to pass custom type array to Postgres function
Or you could create a Postgres function taking a VARIADIC parameter, which takes individual arguments and forms an array from them:
Passing multiple values in single parameter
How to pass the array from Ruby?
Assuming id to be integer:
MyModel.where('id = ANY(ARRAY[?]::int[])', ids.map { |i| i})
But I am just dabbling in Ruby. @mu provides detailed instructions in this related answer:
Sending array of values to a sql query in ruby?

Check if value exists in Postgres array

Using Postgres 9.0, I need a way to test if a value exists in a given array. So far I came up with something like this:
select '{1,2,3}'::int[] @> (ARRAY[]::int[] || value_variable::int)
But I keep thinking there should be a simpler way to this, I just can't see it. This seems better:
select '{1,2,3}'::int[] @> ARRAY[value_variable::int]
I believe it will suffice. But if you have other ways to do it, please share!
Simpler with the ANY construct:
SELECT value_variable = ANY ('{1,2,3}'::int[])
The right operand of ANY (between parentheses) can either be a set (result of a subquery, for instance) or an array. There are several ways to use it:
SQLAlchemy: how to filter on PgArray column types?
IN vs ANY operator in PostgreSQL
Important difference: Array operators (<@, @>, && et al.) expect array types as operands and support GIN or GiST indices in the standard distribution of PostgreSQL, while the ANY construct expects an element type as left operand and can be supported with a plain B-tree index (with the indexed expression to the left of the operator, not the other way round like it seems to be in your example). Example:
Index for finding an element in a JSON array
None of this works for NULL elements. To test for NULL:
Check if NULL exists in Postgres array
Watch out for the trap I got into: When checking if certain value is not present in an array, you shouldn't do:
SELECT value_variable != ANY('{1,2,3}'::int[])
but use
SELECT value_variable != ALL('{1,2,3}'::int[])
instead.
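The trap is easy to see when the two constructs are written out as Python's any() and all(). This is a sketch that ignores NULL handling, and the helper names are hypothetical.

```python
arr = [1, 2, 3]

def ne_any(x, values):
    """x != ANY(values): true if x differs from at least one element."""
    return any(x != v for v in values)

def ne_all(x, values):
    """x != ALL(values): true only if x differs from every element."""
    return all(x != v for v in values)

# 1 is in the array, yet != ANY is true because 1 differs from 2 and 3.
present_any = ne_any(1, arr)
present_all = ne_all(1, arr)
absent_all = ne_all(4, arr)
print(present_any, present_all, absent_all)
```

For any array with two or more distinct values, != ANY is always true, which is why it cannot be used as a "not present" test.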
but if you have other ways to do it please share.
You can compare two arrays. If any of the values in the left array overlap the values in the right array, then it returns true. It's kind of hackish, but it works.
SELECT '{1}' && '{1,2,3}'::int[]; -- true
SELECT '{1,4}' && '{1,2,3}'::int[]; -- true
SELECT '{4}' && '{1,2,3}'::int[]; -- false
In the first and second query, value 1 is in the right array
Notice that the second query is true, even though the value 4 is not contained in the right array
For the third query, no values in the left array (i.e., 4) are in the right array, so it returns false
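The difference between overlap (&&) and containment (@>) can be modelled with Python sets. This is a sketch with hypothetical helper names; it shows why the second query above is true under && but would be false under @>.

```python
def overlaps(left, right):
    """Model of left && right: the arrays share at least one element."""
    return bool(set(left) & set(right))

def contains(container, items):
    """Model of container @> items: every item appears in the container."""
    return set(items) <= set(container)

right = [1, 2, 3]
overlap_results = [overlaps([1], right), overlaps([1, 4], right), overlaps([4], right)]
contain_results = [contains(right, [1]), contains(right, [1, 4]), contains(right, [4])]
print(overlap_results, contain_results)
```

For a single value the two operators agree, so either works for a membership test; they only diverge once the left array has more than one element.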
unnest can be used as well.
It expands an array into a set of rows, after which checking whether a value exists is as simple as using IN or NOT IN.
e.g.
id => uuid
exception_list_ids => uuid[]
select * from table where id NOT IN (select unnest(exception_list_ids) from table2)
Hi that one works fine for me, maybe useful for someone
select * from your_table where array_column ::text ilike ANY (ARRAY['%text_to_search%'::text]);
ANY works well. Just make sure the ANY keyword is on the right side of the equals sign.
The statement below throws ERROR: syntax error at or near "any"
select 1 where any('{hello}'::text[]) = 'hello';
Whereas the example below works fine:
select 1 where 'hello' = any('{hello}'::text[]);
When testing whether an element exists in an array, proper casting is required to get past the Postgres SQL parser. Here is an example query using the array contains operator in the join clause.
For simplicity I only list the relevant part:
table1 other_name text[]; -- is an array of text
The join part of the SQL:
from table1 t1 join table2 t2 on t1.other_name::text[] @> ARRAY[t2.panel::text]
The following also works:
on t2.panel = ANY(t1.other_name)
I am just guessing that the extra cast is required because the parser does not fetch the table definition to figure out the exact type of the column. Others, please comment on this.