I want to be able to ignore or prevent an INSERT from occurring, from within a Postgres function, if a JSON object has keys but only null values. Here is a small, contrived example:
DROP TABLE IF EXISTS mytable;
create table mytable(
a text,
b text
);
CREATE OR REPLACE FUNCTION insert_to_table(
somedata JSONB
)
RETURNS VOID AS $$
BEGIN
insert into mytable (a, b) select a,b from jsonb_populate_recordset(null::mytable, $1::jsonb);
END;
$$ LANGUAGE plpgsql;
select insert_to_table('[{"a":1,"b":2},{"a":3,"b":4}, {"a":null, "b": null}, {"a": null, "b": "some data"}]');
This will insert 4 records, with the first row being 1,2 and the next row being 3,4. The third row is null, null, and the fourth is null, some data.
In this scenario, rows 1,2, and 4 are valid. I want to ignore 3 and prevent it from being inserted.
I do not want a blank row, and my data/table will be much larger than what is listed (roughly 20 fields in the table, and 20 key/value pairs in the JSON).
Most likely I will need to loop over the array and pick out the JSON objects where ALL the values are null, not just 1 or 2 of them.
How can I do that?
In Postgres you can refer to a complete row using the name of the table (or its alias) in the query and compare that to NULL. A record is considered NULL if all its columns are null. So you can do:
create or replace function insert_to_table(somedata jsonb)
returns void as $$
begin
insert into mytable (a, b)
select a, b
from jsonb_populate_recordset(null::mytable, somedata) as t
where not (t is null);
end;
$$ language plpgsql;
Note that where t is not null is something different than where not (t is null): a record IS NULL only if all columns are null, and IS NOT NULL only if all columns are non-null, so for a row with a mix of null and non-null columns both tests are false. This works regardless of the number of columns or their data types.
To visualize the logic. The following:
select a,b,
not (t is null) as "not (t is null)",
t is null as "t is null",
t is not null as "t is not null"
from jsonb_populate_recordset(null::mytable,
'[{"a":1,"b":2},
{"a":3,"b":4},
{"a":null, "b": null},
{"a": null, "b": "some data"}]'::jsonb) as t(a,b)
returns:
 a | b         | not (t is null) | t is null | t is not null
---+-----------+-----------------+-----------+--------------
 1 | 2         | true            | false     | true
 3 | 4         | true            | false     | true
   |           | false           | true      | false
   | some data | true            | false     | false
Unrelated:
The cast $1::jsonb is redundant, as you have already declared the parameter with that type.
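A quick check, assuming the table and function above — the all-null object should now be skipped:
select insert_to_table('[{"a":1,"b":2},{"a":3,"b":4}, {"a":null, "b": null}, {"a": null, "b": "some data"}]');
select count(*) from mytable; -- 3, not 4: the all-null object was filtered out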
Related
Assuming I have a table with JSONB data:
create table x (
id integer primary key generated always as identity,
name text,
data jsonb
);
Assuming data can contain nested values, I would like to display everything inside data to get this kind of result:
id | name  | data.a | data.b.0 | data.b.1 | data.c
---+-------+--------+----------+----------+--------
 1 | test  |      1 | foo      | bar      | baz
 2 | test2 |    789 | pim      | pam      | boom
Is there a way to do this without specifying all the JSONB properties names?
The JSONB_TO_RECORDSET() function might be used within a SELECT statement like this:
SELECT a AS "data.a",
(b::JSONB) ->> 0 AS "data.b.0", (b::JSONB) ->> 1 AS "data.b.1",
c AS "data.c"
FROM x,
JSONB_TO_RECORDSET(data) AS j(a INT, b TEXT, c TEXT)
ORDER BY id
Presuming you have JSONB values like these in the data column:
[ { "a": 1, "b": ["foo","bar"], "c": "baz" }]
[ { "a": 789, "b": ["pim","pam"], "c": "boom" }]
Suppose I have a PostgreSQL table t that looks like
id | name | y
----+------+---
0 | 'a' | 0
1 | 'b' | 0
2 | 'c' | 0
3 | 'd' | 1
4 | 'e' | 2
5 | 'f' | 2
With id being the primary key and with a UNIQUE constraint on (name, y).
Suppose I want to update this table in such a way that the part of the data set with y = 0 becomes (without knowing what is already there)
id | name | y
----+------+---
0 | 'a' | 0
1 | 'x' | 0
2 | 'y' | 0
I could use
DELETE FROM t WHERE y = 0 AND name NOT IN ('a', 'x', 'y');
INSERT INTO t (name, y) VALUES ('a', 0), ('x', 0), ('y', 0)
ON CONFLICT (name, y) DO NOTHING;
I feel like there must be a one-statement way to do this (like what upsert does for the task "update the existing entries and insert missing ones", but for "insert the missing entries and delete the entries that should not be there"). Is there? I have heard rumours that Oracle has something called MERGE, but I'm not sure exactly what it does.
This can be done with a single statement. But I doubt whether that classifies as "simpler".
Additionally, your expected output doesn't quite make sense.
Your insert statement does not provide a value for the primary key column (id), so apparently the id column is a generated (identity/serial) column.
But in that case, new rows can't have the same IDs as the ones before, because new IDs are generated when the new rows are inserted.
Given the above change to your expected output, the following does what you want:
with data (name, y) as (
values ('a', 0), ('x', 0), ('y', 0)
), changed as (
insert into t (name, y)
select *
from data
on conflict (name,y) do nothing
)
delete from t
where y = 0
  and (name, y) not in (select name, y from data);
That is one statement, but certainly not "simpler". The only advantage I can see is that you do not have to specify the list of values twice.
Online example: https://rextester.com/KKB30299
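As for the MERGE rumour: modern PostgreSQL has it too. Version 15 added MERGE, and version 17 added the WHEN NOT MATCHED BY SOURCE clause, which covers the "delete the rows that should not be there" half. A sketch, assuming PostgreSQL 17 or later:
merge into t
using (values ('a', 0), ('x', 0), ('y', 0)) as data(name, y)
on t.name = data.name and t.y = data.y
when not matched then
  insert (name, y) values (data.name, data.y)
when not matched by source and t.y = 0 then
  delete;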
Unless there's a tremendous number of rows to be updated, do it as three update statements.
update t set name = 'a' where id = 0;
update t set name = 'x' where id = 1;
update t set name = 'y' where id = 2;
This is simple. It's easily done in a loop with a SQL builder. There are no race conditions, as there are with deleting and inserting. And it preserves the ids and other columns of those rows.
To demonstrate with some pseudo-Ruby code:
new_names = ['a', 'x', 'y']
# In a transaction
db.transaction {
# Query the matching IDs in the same order as their new names
ids_to_update = db.select("
select id from t where y = 0 order by id
")
# Iterate through the IDs and new names together
ids_to_update.zip(new_names).each { |id,name|
# Update the row with its new name
db.execute("
update t set name = ? where id = ?
", name, id)
}
}
Fooling around some, here's how I did it in "one" statement, or at least one thing sent to the server, while preserving the IDs and no race conditions.
do $$
declare
change text[];
changes text[][];
begin
select array_agg(array[id::text,name])
into changes
from unnest(
(select array_agg(id order by id) from t where y = 0),
array['a','x','y']
) with ordinality as a(id, name);
foreach change slice 1 in array changes
loop
update t set name = change[2] where id = change[1]::int;
end loop;
end$$;
The goal is to produce an array of arrays matching the id to its new name. That can be iterated over to do the updates.
unnest(
(select array_agg(id order by id) from t where y = 0),
array['a','x','y']
) with ordinality as a(id, name);
That bit produces rows with the IDs and their new names side by side.
select array_agg(array[id::text,name])
into changes
from unnest(...) with ordinality as a(id, name);
Then those rows of IDs and names are turned into an array of arrays like: {{0,a},{1,x},{2,y}}. (There's probably a more direct way to do that.)
foreach change slice 1 in array changes
loop
update t set name = change[2] where id = change[1]::int;
end loop;
Finally we loop over the array and use it to perform each update.
You can turn this into a proper function and pass in the y value to match and the array of names to change them to. You should verify that the lengths of the id and name arrays match.
This might be faster, depends on how many rows you're updating, but it isn't simpler, and it took some time to puzzle out.
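On that "more direct way": the intermediate array and the loop can be skipped entirely with a single UPDATE joined to the unnested pairs. A sketch, assuming the same table t and the same match-by-id-order approach as above:
update t
set name = a.name
from unnest(
(select array_agg(id order by id) from t where y = 0),
array['a','x','y']
) as a(id, name)
where t.id = a.id;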
I'm trying to implement the EAV pattern using Attribute->Value tables, but unlike the standard way, the values are stored in a jsonb field like {"attrId":[values]}. This makes search requests easy, like:
SELECT * FROM products p WHERE p.attributes @> '{"1": [2]}' AND p.attributes @> '{"1": [4]}';
Now I'm wondering whether this is a good approach, and what an effective way is to calculate the count of available variations, for example:
-p1- {"width":[1]}
-p2- {"width":[2],"height":[3]}
-p3- {"width":[1]}
The output will be:
width: 1 (count 2); 2 (count 1)
height: 3 (count 1)
and when width 2 is selected:
width: 1 (count 0); 2 (count 1)
height: 3 (count 1)
"Flat is better than nested" -- the zen of python
I think you would be better served to use simple key/value pairs and in the rare event you have a complex value, then make it a list. But I don't see that use case.
Here is an example which answers your question. It could be modified to use your structure, but let's keep it simple:
First create a table and insert some JSON:
# create table foo (a jsonb);
# insert into foo values ('{"a":"1", "b":"2"}');
# insert into foo values ('{"c":"3", "d":"4"}');
# insert into foo values ('{"e":"5", "a":"6"}');
Here are the records:
# select * from foo;
a
----------------------
{"a": "1", "b": "2"}
{"c": "3", "d": "4"}
{"a": "6", "e": "5"}
(3 rows)
Here is the output of the jsonb_each_text() function (see https://www.postgresql.org/docs/9.6/static/functions-json.html):
# select jsonb_each_text(a) from foo;
jsonb_each_text
-----------------
(a,1)
(b,2)
(c,3)
(d,4)
(a,6)
(e,5)
(6 rows)
Now we need to put it in a table expression to be able to get access to the individual fields:
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, (rec).value from t1;
key | value
-----+-------
a | 1
b | 2
c | 3
d | 4
a | 6
e | 5
(6 rows)
And lastly, here is a grouping with the SUM function. Notice that the a key, which was in the database twice, has been properly summed.
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;
key | sum
-----+-----
c | 3
b | 2
a | 7
e | 5
d | 4
(5 rows)
As a final note, (rec) has parentheses around it because otherwise it would incorrectly be treated as a table reference, resulting in this error:
ERROR: missing FROM-clause entry for table "rec"
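Adapting the same idea to the structure from the question — a sketch, assuming a products table with an attributes jsonb column holding values like {"width":[1]}:
select e.key, v.value, count(*)
from products p,
jsonb_each(p.attributes) as e(key, value),
jsonb_array_elements(e.value) as v(value)
group by e.key, v.value
order by e.key, v.value;
For the three sample products this should yield width: 1 (count 2); 2 (count 1) and height: 3 (count 1).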
I have a column of type jsonb holding a list of IDs as a plain JSON array in my PostgreSQL 9.6.6 database, and I want to search this field based on any ID in the list. How do I write this query?
'[1,8,3,4,56,6]'
For example, my table is:
CREATE TABLE mytable (
id bigint NOT NULL,
numbers jsonb
);
And it has some values:
id | numbers
-----+-------
1 | "[1,8,3,4,56,6]"
2 | "[1,2,7,4,24,5]"
I want something like this:
SELECT *
FROM mytable
WHERE
id = 1
AND
numbers::json->>VALUE(56)
;
Expected result (only if the JSON array has 56 as element):
id | numbers
-----+-------
1 | "[1,8,3,4,56,6]"
Step 2 problem:
The result of this command is TRUE:
SELECT '[1,8,3,4,56,6]'::jsonb @> '56';
but when I use
SELECT *
FROM mytable
WHERE numbers::jsonb @> '[56]';
or
SELECT *
FROM mytable
WHERE numbers::jsonb @> '56';
or
SELECT *
FROM mytable
WHERE numbers::jsonb @> '[56]'::jsonb;
The result is nothing:
id | numbers
-----+-------
(0 rows)
Instead of being this:
id | numbers
-----+-------
1 | "[1,8,3,4,56,6]"
I found out why I get (0 rows)! :))
It's because I inserted the jsonb values into mytable with double quotation marks; in fact, the correct value format is without double quotation marks:
id | numbers
-----+-------
1 | [1,8,3,4,56,6]
2 | [1,2,7,4,24,5]
Now when I run this command:
SELECT *
FROM mytable
WHERE numbers @> '56';
The result is :
id | numbers
-----+-------
1 | [1,8,3,4,56,6]
Use the jsonb "contains" operator @>:
SELECT *
FROM mytable
WHERE id = 1
AND numbers @> '[56]';
Or
...
AND numbers @> '56';
Works with or without enclosing array brackets in this case: as a special exception, a jsonb array contains a primitive value if that value appears among its top-level elements.
This can be supported with various kinds of indexes for great read performance if your table is big.
Detailed explanation / instructions:
Index for finding an element in a JSON array
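For example, a minimal sketch (the index names are mine):
CREATE INDEX mytable_numbers_gin_idx ON mytable USING GIN (numbers);
-- or, smaller and faster if you only need the @> containment operator:
CREATE INDEX mytable_numbers_path_idx ON mytable USING GIN (numbers jsonb_path_ops);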
Hint (addressing your comment): when testing with string literals, be sure to add an explicit cast:
SELECT '[1,8,3,4,56,6]'::jsonb @> '56';
If you don't, Postgres does not know which data types to assume. There are multiple options:
SELECT '[1,8,3,4,56,6]' @> '56';
ERROR: operator is not unique: unknown @> unknown
Related:
GIN index on smallint[] column not used or error "operator is not unique"
I need to write a query (or function) that will update existing records in a table with values stored in an hstore column in another table. For example:
create temp table foo(id int primary key, f1 int, f2 text, f3 int);
insert into foo values
(1, 1, 'jack', 1),
(2, 2, 'ted' , 2),
(3, 3, 'fred', 3);
create temp table bar(foo_id int references foo(id), row_data hstore);
insert into bar values
(1, 'f1=>0, f2=>bill'::hstore),
(2, 'f1=>0, f2=>will, f3=>0'::hstore),
(3, 'f3=>0'::hstore);
Only columns that have values in the hstore column should get updated, so after processing, the desired result would be:
select * from foo;
+----+----+------+----+
| id | f1 | f2 | f3 |
+----+----+------+----+
| 1 | 0 | bill | 1 |
| 2 | 0 | will | 0 |
| 3 | 3 | fred | 0 |
+----+----+------+----+
What is the "best" way to update foo with the values in bar?
Note: I'm defining best as being the easiest to code. While performance is always important, this is a batch job and the speed is not as critical as it might be if a user was waiting on the results.
I'm using PostgreSQL 9.4.
To retain original column values if nothing is supplied in the hstore column ...
Simple method with COALESCE
UPDATE foo f
SET f1 = COALESCE((b.row_data->'f1')::int, f1)
, f2 = COALESCE( b.row_data->'f2' , f2)
, f3 = COALESCE((b.row_data->'f3')::int, f3)
FROM bar b
WHERE f.id = b.foo_id
AND b.row_data ?| '{f1,f2,f3}'::text[];
The added last line excludes unaffected rows from the UPDATE right away: the ?| operator checks (per documentation):
does hstore contain any of the specified keys?
If that's not the case it's cheapest not to touch the row at all.
Else, at least one (but not necessarily all!) of the columns receives an UPDATE. That's where COALESCE comes in.
However, per documentation:
A value (but not a key) can be an SQL NULL.
So COALESCE cannot distinguish between two possible meanings of NULL here:
The key 'f2' was not found.
b.row_data->'f2' returns NULL as new value for f2.
Works for NULL values, too
UPDATE foo f
SET f1 = CASE WHEN b.row_data ? 'f1'
THEN (b.row_data->'f1')::int ELSE f1 END
, f2 = CASE WHEN b.row_data ? 'f2'
THEN b.row_data->'f2' ELSE f2 END
, f3 = CASE WHEN b.row_data ? 'f3'
THEN (b.row_data->'f3')::int ELSE f3 END
FROM bar b
WHERE f.id = b.foo_id
AND b.row_data ?| '{f1,f2,f3}'::text[];
The ? operator checks for a single key:
does hstore contain key?
So you're after a simple update? As f1 and f3 are integers you need to cast those. Otherwise it's just:
UPDATE foo SET f1 = (row_data->'f1')::integer,
f2 = row_data->'f2',
f3 = (row_data->'f3')::integer
FROM bar WHERE foo.id = foo_id;
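One caveat with this simple form: keys that are absent from the hstore come back as NULL instead of keeping the old column value. A quick check against the sample data above:
select * from foo order by id;
 id | f1 |  f2  | f3
----+----+------+----
  1 |  0 | bill |        <- f3 was 1; 'f3' is missing from the hstore
  2 |  0 | will |  0
  3 |    |      |  0     <- f1/f2 were 3/'fred'; both keys are missing
That is exactly what the COALESCE and CASE variants above are there to avoid.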