Ignore specific rows and merge operation using ON CONFLICT in PostgreSQL

Table Structure:
create table example_test (a_id integer, b_id integer, c_id integer, flag integer);
Unique Constraint:
Alter table example_test
add constraint u_key unique(a_id, b_id, c_id);
My code:
with a_ins_upd as (
Insert into example_test (a_id, b_id, c_id, flag)
select x.a_id, x.b_id, x.c_id, x.flag
from <input_tableType> x
on conflict on constraint u_key
do update
set
a_id = excluded.a_id,
b_id = excluded.b_id,
c_id = excluded.c_id,
flag = excluded.flag
where example_test.flag = 0
)
Operations on Data:
I want to ignore the records with flag=1, and do the Upsert on the other records.

Basically I think you want a partial unique index (a unique index with a where clause).
Instead of:
alter table example_test
add constraint u_key unique(a_id, b_id, c_id);
You could do:
create unique index example_idx on example_test(a_id, b_id, c_id) where flag = 0;
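You can then use a regular insert ... on conflict statement, without the where clause on do update. Because the index is partial, the conflict target has to repeat the index predicate: on conflict (a_id, b_id, c_id) where flag = 0. The on conflict on constraint form cannot be used here, since an index created this way is not a named constraint.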
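A minimal sketch of that upsert against the partial index, assuming PostgreSQL 9.5 or later. The incoming table is a hypothetical stand-in for the question's <input_tableType>, and the sample rows are made up for illustration.

```sql
-- Sketch only: "incoming" is a hypothetical stand-in for <input_tableType>.
create table example_test (a_id integer, b_id integer, c_id integer, flag integer);
create unique index example_idx on example_test (a_id, b_id, c_id) where flag = 0;

create temporary table incoming (a_id integer, b_id integer, c_id integer, flag integer);
insert into incoming values (1, 1, 1, 0), (2, 2, 2, 0);

insert into example_test (a_id, b_id, c_id, flag)
select x.a_id, x.b_id, x.c_id, x.flag
from incoming x
-- The predicate below lets PostgreSQL infer the partial index as the arbiter.
on conflict (a_id, b_id, c_id) where flag = 0
do update
set flag = excluded.flag;
```

Rows with flag = 1 fall outside the index, so as far as I understand they never raise a conflict and are simply inserted. If that is not the intended meaning of "ignore", they would need to be filtered out in the select instead.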

OK, so exclusion constraints are not supported as arbiters by on conflict do update (only by do nothing). That makes sense, since a single new row could conflict with multiple existing records. The only way is to handle it programmatically.

Related

On conflict do nothing with a custom constraint

I need to do the following:
insert into table_a (primarykey_field, other_field)
select primarykey_field, other_field from table_b b
on conflict (primarykey_field) where primarykey_field >>= b.primarykey_field do nothing;
Never mind the exact operator in my where condition; it could be anything except a simple equality. In my case I'm using a custom IP range field, so I want to check that one IP address is not in the range of another IP address when I'm inserting a new row.
Is there a way I can do this with on conflict or with another query?
You can filter out all rows which have a pkey_ip_range that's already contained by an existing pkey_ip_range:
insert into table_a as a (
pkey_ip_range,
other_field)
select pkey_ip_range,
other_field
from table_b b
where not exists (
select 1
from table_a
where table_a.pkey_ip_range >>= b.pkey_ip_range);
If you wanted to check if the incoming ip range either contains or is contained by the existing ip range (&& rather than >>=), you can use an exclusion constraint:
drop table if exists table_a;
create table table_a (
pkey_ip_range inet primary key,
other_column text);
alter table table_a
add constraint table_a_no_contained_ip_ranges
exclude using gist (pkey_ip_range inet_ops WITH &&);
insert into table_a
(pkey_ip_range,other_column)
values ('192.168.0.0/31','abc');
insert into table_a
(pkey_ip_range,other_column)
values ('192.168.0.0/30','def');
--ERROR: conflicting key value violates exclusion constraint "table_a_no_contained_ip_ranges"
--DETAIL: Key (pkey_ip_range)=(192.168.0.0/30) conflicts with existing key (pkey_ip_range)=(192.168.0.0/31).
insert into table_a
(pkey_ip_range,other_column)
values ('192.168.0.0/32','ghi')
on conflict do nothing;
--table table_a;
-- pkey_ip_range | other_column
------------------+--------------
-- 192.168.0.0/31 | abc
--(1 row)
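The difference between the two operators used above can be checked directly on literal values. This is only an illustrative query using the same networks as the example; the column aliases are made up.

```sql
-- >>= means "contains or equals"; && means "contains or is contained by".
select inet '192.168.0.0/30' >>= inet '192.168.0.0/31' as contains_or_equals,  -- true
       inet '192.168.0.0/31' >>= inet '192.168.0.0/30' as reversed,            -- false
       inet '192.168.0.0/31' &&  inet '192.168.0.0/30' as overlaps;            -- true
```

Because >>= depends on which side the operands are on while && does not, only the && check can be expressed as an exclusion constraint here, since exclusion constraints require a commutative operator.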

Check for uniqueness of column in postgres table

I need to ensure that the values in a column from a table are unique as part of a larger process.
I'm aware of the UNIQUE constraint, but I'm wondering if there is a better way to do the check.
I'm running the queries using psycopg2 so adding that tag on the off chance there's something in there that can help with this.
If the column is unique I can add a constraint. If the column is not unique adding the constraint will return an error.
If there is already a constraint of the same name, a useful error is returned. In this case I would prefer to just check for the existing constraint.
If the column is the primary key, the unique constraint can be added without error but in this case it would be preferable to just recognize that the column must be unique based on the primary key.
Code examples of this below.
DROP TABLE IF EXISTS unique_test;
CREATE TABLE unique_test (
pkey INT PRIMARY KEY,
unique_yes CHAR(1),
unique_no CHAR(1)
);
INSERT INTO unique_test (pkey, unique_yes, unique_no)
VALUES(1, 'a', 'a'),
(2, 'b', 'a');
CREATE UNIQUE INDEX CONCURRENTLY u_test_1 ON unique_test (unique_yes);
ALTER TABLE unique_test
ADD CONSTRAINT unique_target_1
UNIQUE USING INDEX u_test_1;
-- the above runs no problem
-- check what happens when column is not unique
CREATE UNIQUE INDEX CONCURRENTLY u_test_2 ON unique_test (unique_no);
ALTER TABLE unique_test
ADD CONSTRAINT unique_target_2
UNIQUE USING INDEX u_test_2;
-- returns:
-- SQL Error [23505]: ERROR: could not create unique index "u_test_2"
-- Detail: Key (unique_no)=(a) is duplicated.
CREATE UNIQUE INDEX CONCURRENTLY u_test_1 ON unique_test (unique_yes);
ALTER TABLE unique_test
ADD CONSTRAINT unique_target_1
UNIQUE USING INDEX u_test_1;
-- returns
-- SQL Error [42P07]: ERROR: relation "unique_target_1" already exists
-- test what happens when adding a constraint to the primary key column
CREATE UNIQUE INDEX CONCURRENTLY u_test_pkey ON unique_test (pkey);
ALTER TABLE unique_test
ADD CONSTRAINT unique_target_pkey
UNIQUE USING INDEX u_test_pkey;
-- this runs no problem but is inefficient.
If all you want to do is verify that values are unique, then use a query:
select unique_no, count(*)
from unique_test
group by unique_no
having count(*) > 1;
If it needs to be boolean output:
select not exists (
select unique_no, count(*)
from unique_test
group by unique_no
having count(*) > 1
);
If you just want a flag, you can use:
select count(*) <> count(distinct unique_no) as duplicate_flag
from unique_test;
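For the question's other two cases (a constraint that already exists, or a column that is already the primary key), the system catalogs can be consulted instead of scanning the data. The query below is a sketch against the question's unique_test table and only looks at single-column constraints.

```sql
-- Lists primary key ('p') and unique ('u') constraints on unique_test
-- that cover exactly one column, together with that column's name.
select c.conname,
       c.contype,
       a.attname
from pg_constraint c
join pg_attribute a
  on a.attrelid = c.conrelid
 and a.attnum = c.conkey[1]
where c.conrelid = 'unique_test'::regclass
  and c.contype in ('p', 'u')
  and array_length(c.conkey, 1) = 1;
```

One caveat: a unique index created with CREATE UNIQUE INDEX and never attached to a constraint does not appear in pg_constraint. For that case pg_index (column indisunique) would have to be checked as well.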
DELETE FROM
zoo x
USING zoo y
WHERE
x.animal_id < y.animal_id
AND x.animal = y.animal;
I think this is simpler, https://kb.objectrocket.com/postgresql/delete-duplicate-rows-in-postgresql-762 for reference

No unique or exclusion constraint matching the ON CONFLICT

I'm getting the following error when doing the following type of insert:
Query:
INSERT INTO accounts (type, person_id) VALUES ('PersonAccount', 1) ON
CONFLICT (type, person_id) WHERE type = 'PersonAccount' DO UPDATE SET
updated_at = EXCLUDED.updated_at RETURNING *
Error:
SQL execution failed (Reason: ERROR: there is no unique or exclusion
constraint matching the ON CONFLICT specification)
I also have a unique index:
CREATE UNIQUE INDEX uniq_person_accounts ON accounts USING btree (type,
person_id) WHERE ((type)::text = 'PersonAccount'::text);
The thing is that sometimes it works, but not every time. I randomly get
that exception, which is really strange. It seems that it can't access that
INDEX or it doesn't know it exists.
Any suggestion?
I'm using PostgreSQL 9.5.5.
Example while executing the code that tries to find or create an account:
INSERT INTO accounts (type, person_id, created_at, updated_at) VALUES ('PersonAccount', 69559, '2017-02-03 12:09:27.259', '2017-02-03 12:09:27.259') ON CONFLICT (type, person_id) WHERE type = 'PersonAccount' DO UPDATE SET updated_at = EXCLUDED.updated_at RETURNING *
SQL execution failed (Reason: ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification)
In this case, I'm sure that the account does not exist. Furthermore, it never outputs the error when the person has already an account. The problem is that, in some cases, it also works if there is no account yet. The query is exactly the same.
Per the docs,
All table_name unique indexes that, without regard to order, contain exactly the
conflict_target-specified columns/expressions are inferred (chosen) as arbiter
indexes. If an index_predicate is specified, it must, as a further requirement
for inference, satisfy arbiter indexes.
The docs go on to say,
[index_predicate are u]sed to allow inference of partial unique indexes
In an understated way, the docs are saying that when using a partial index and
upserting with ON CONFLICT, the index_predicate must be specified. It is not
inferred for you. I learned this
here, and the following example demonstrates this.
CREATE TABLE test.accounts (
id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
type text,
person_id int);
CREATE UNIQUE INDEX accounts_note_idx on test.accounts (type, person_id) WHERE ((type)::text = 'PersonAccount'::text);
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10);
so that we have:
unutbu=# select * from test.accounts;
+----+---------------+-----------+
| id | type | person_id |
+----+---------------+-----------+
| 1 | PersonAccount | 10 |
+----+---------------+-----------+
(1 row)
Without index_predicate we get an error:
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10) ON CONFLICT (type, person_id) DO NOTHING;
-- ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
But if instead you include the index_predicate, WHERE ((type)::text = 'PersonAccount'::text):
INSERT INTO test.accounts (type, person_id) VALUES ('PersonAccount', 10)
ON CONFLICT (type, person_id)
WHERE ((type)::text = 'PersonAccount'::text) DO NOTHING;
then there is no error and DO NOTHING is honored.
A simple solution of this error
First of all let's see the cause of error with a simple example. Here is the table mapping products to categories.
create table if not exists product_categories (
product_id uuid references products(product_id) not null,
category_id uuid references categories(category_id) not null,
whitelist boolean default false
);
If we use this query:
INSERT INTO product_categories (product_id, category_id, whitelist)
VALUES ('123...', '456...', TRUE)
ON CONFLICT (product_id, category_id)
DO UPDATE SET whitelist=EXCLUDED.whitelist;
This fails with the error "there is no unique or exclusion constraint matching the ON CONFLICT specification" because there is no unique constraint on (product_id, category_id). Without one, multiple rows may share the same combination of product and category id, so a conflict on them can never be detected.
Solution:
Use unique constraint on both product_id and category_id like this:
create table if not exists product_categories (
product_id uuid references products(product_id) not null,
category_id uuid references categories(category_id) not null,
whitelist boolean default false,
primary key(product_id, category_id) -- This will solve the problem
-- unique(product_id, category_id) -- OR this if you already have a primary key
);
Now you can use ON CONFLICT (product_id, category_id) for both columns without any error.
In short: whatever column(s) you list in on conflict must together be covered by a unique constraint.
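If the table already exists, the same constraint can be added afterwards rather than in the create table statement. The following is a sketch under assumptions: the foreign keys are left out so that it runs on its own, the constraint name is an arbitrary choice, and the uuid values are placeholders.

```sql
-- Hypothetical stand-alone version of the table (foreign keys omitted).
create table if not exists product_categories (
    product_id  uuid not null,
    category_id uuid not null,
    whitelist   boolean default false
);

-- Add the missing unique constraint to the existing table.
-- This step fails if duplicate (product_id, category_id) pairs already exist.
alter table product_categories
    add constraint product_categories_product_category_key
    unique (product_id, category_id);

insert into product_categories (product_id, category_id, whitelist)
values ('00000000-0000-0000-0000-000000000001',
        '00000000-0000-0000-0000-000000000002', true)
on conflict (product_id, category_id)
do update set whitelist = excluded.whitelist;
```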
The easy way to fix it is by setting the conflicting column as UNIQUE
I did not have a chance to play with UPSERT, but I think you have a case from
docs:
Note that this means a non-partial unique index (a unique index
without a predicate) will be inferred (and thus used by ON CONFLICT)
if such an index satisfying every other criteria is available. If an
attempt at inference is unsuccessful, an error is raised.
I solved the same issue by creating one UNIQUE INDEX for ALL columns you want to include in the ON CONFLICT clause, not one UNIQUE INDEX for each of the columns.
CREATE TABLE table_name (
element_id UUID NOT NULL DEFAULT gen_random_uuid(),
timestamp TIMESTAMP NOT NULL DEFAULT now():::TIMESTAMP,
col1 UUID NOT NULL,
col2 STRING NOT NULL ,
col3 STRING NOT NULL ,
CONSTRAINT "primary" PRIMARY KEY (element_id ASC),
UNIQUE (col1 asc, col2 asc, col3 asc)
);
This allows a query like:
INSERT INTO table_name (timestamp, col1, col2, col3) VALUES ('timestamp', 'uuid', 'string', 'string')
ON CONFLICT (col1, col2, col3)
DO UPDATE SET timestamp = EXCLUDED.timestamp, col1 = EXCLUDED.col1, col2 = EXCLUDED.col2, col3 = EXCLUDED.col3;

Define foreign key in Postgres to a subset of a target table

Example:
I have:
Table A:
int id
int table_b_id
Table B:
int id
text type
I want to add a constraint check on column table_b_id that will verify that it points only to rows in table B whose type value is 'X'.
I can't change the table structure.
I understand it can be done with a CHECK and a Postgres function that runs the specific query, but I've seen people recommend avoiding that.
Any inputs on what is the best approach to implement it will be helpful.
What you are describing is not a FOREIGN KEY. In PostgreSQL a foreign key refers to one or more columns in another table that have a unique index on them, and it may have automatic actions associated with changes to those columns (ON UPDATE, ON DELETE).
You are trying to enforce a specific kind of referential integrity, similar to what a FOREIGN KEY does. You can do this with a CHECK clause and a function (because the CHECK clause does not allow sub-queries), or with table inheritance and range partitioning (refer to a child table which holds only rows where type = 'X'), but it is probably easiest to do this with a trigger:
CREATE FUNCTION trf_test_type_x() RETURNS trigger AS $$
BEGIN
PERFORM * FROM tableB WHERE id = NEW.table_b_id AND type = 'X';
IF NOT FOUND THEN
-- RAISE NOTICE 'Foreign key violation...';
RETURN NULL;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tr_test_type_x
BEFORE INSERT OR UPDATE ON tableA
FOR EACH ROW EXECUTE PROCEDURE trf_test_type_x();
You can create a partial index on tableB to speed things up:
CREATE UNIQUE INDEX idx_type_X ON tableB(id) WHERE type = 'X';
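The trigger above returns NULL, which silently skips the offending row. If a hard error is preferred, closer to what a real foreign key does, a variant along the following lines could raise an exception instead. This is a sketch, not the answer's code: the table names table_a and table_b and the function and trigger names are assumptions chosen so the block runs on its own.

```sql
-- Hypothetical stand-alone tables mirroring the question's A and B.
create table table_b (id int primary key, type text);
create table table_a (id int primary key, table_b_id int);

create function trf_require_type_x() returns trigger as $$
begin
    -- A NULL reference is allowed, as with an ordinary foreign key.
    if new.table_b_id is null then
        return new;
    end if;
    perform 1 from table_b where id = new.table_b_id and type = 'X';
    if not found then
        raise exception 'table_b_id % does not reference a table_b row of type X',
            new.table_b_id
            using errcode = 'foreign_key_violation';
    end if;
    return new;
end;
$$ language plpgsql;

create trigger tr_require_type_x
before insert or update on table_a
for each row execute procedure trf_require_type_x();

insert into table_b values (1, 'X'), (2, 'Y');
insert into table_a values (10, 1);   -- accepted
insert into table_a values (11, 2);   -- rejected with SQLSTATE 23503
```

Either way the trigger only guards writes to the referencing table. Updating or deleting rows in the referenced table (for example changing a type from 'X' to 'Y') is not checked and would need a second trigger there.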
The most elegant solution, in my opinion, is to use inheritance to get a subtyping behavior:
PostgreSQL 9.3 Schema Setup with inheritance:
create table B ( id int primary key );
-- Instead of creating a 'type' field, inherit from B for
-- each type with custom properties:
create table B_X ( -- some_data varchar(10 ),
constraint pk primary key (id)
) inherits (B);
-- Sample data:
insert into B_X (id) values ( 1 );
insert into B (id) values ( 2 );
-- Now, instead of referencing B, you should reference B_X:
create table A ( id int primary key, B_id int references B_X(id) );
-- Here it is:
insert into A values ( 1, 1 );
-- Inserting wrong values will cause a violation:
insert into A values ( 2, 2 );
ERROR: insert or update on table "a" violates foreign key constraint "a_b_id_fkey"
Detail: Key (b_id)=(2) is not present in table "b_x".
Retrieving all data from base table:
select * from B
Results:
| id |
|----|
| 2 |
| 1 |
Retrieving data with type:
SELECT p.relname, c.*
FROM B c inner join pg_class p on c.tableoid = p.oid
Results:
| relname | id |
|---------|----|
| b | 2 |
| b_x | 1 |

Updating foreign keys while inserting into new table

I have table A(id).
I need to:
1) create table B(id)
2) add a foreign key to table A that references B.id
3) for every row in A, insert a row in B and update A.b_id with the id of the newly inserted row in B
Is it possible to do it without adding a temporary column in B that refers to A? The below does work, but I'd rather not have to make a temporary column.
alter table B add column ref_id integer references A(id);
insert into B (ref_id) select id from A;
update A set b_id = B.id from B where B.ref_id = A.id;
alter table B drop column ref_id;
Assuming that:
1) you're using postgresql 9.1
2) B.id is a serial (so actually an int with a default value of nextval('b_id_seq'))
3) when inserting to B, you actually add other fields from A otherwise the insert is useless
...I think something like this would work:
with n as (select nextval('b_id_seq') as newbid,a.id as a_id from a),
l as (insert into b(id) select newbid from n returning id as b_id)
update a set b_id=l.b_id from l,n where a.id=n.a_id and l.b_id=n.newbid;
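To try the statement above end to end, a small setup such as the following could be used. The sample ids and the constraint name a_b_id_fkey are made up for this sketch, and the foreign key is added only after the column has been filled.

```sql
create table a (id integer primary key);
insert into a (id) values (10), (20), (30);

create table b (id serial primary key);   -- also creates the sequence b_id_seq
alter table a add column b_id integer;

-- n reserves one new b.id per row of a, l inserts those ids into b,
-- and the outer update writes each reserved id back to its row in a.
with n as (
    select nextval('b_id_seq') as newbid, a.id as a_id from a
),
l as (
    insert into b (id) select newbid from n returning id as b_id
)
update a set b_id = l.b_id
from l, n
where a.id = n.a_id and l.b_id = n.newbid;

alter table a
    add constraint a_b_id_fkey foreign key (b_id) references b (id);

-- Sanity check: every row of a should now point at a row of b.
select count(*) as rows_without_b from a where b_id is null;
```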
Add the future foreign key column, but without the constraint itself:
ALTER TABLE A ADD b_id integer;
Fill the new column with values:
WITH cte AS (
SELECT
id,
ROW_NUMBER() OVER (ORDER BY id) AS b_ref
FROM A
)
UPDATE A
SET b_id = cte.b_ref
FROM cte
WHERE A.id = cte.id;
Create the other table:
CREATE TABLE B (
id integer CONSTRAINT PK_B PRIMARY KEY
);
Add rows to the new table using the referencing column of the existing one:
INSERT INTO B (id)
SELECT b_id
FROM A;
Add the FOREIGN KEY constraint:
ALTER TABLE A
ADD CONSTRAINT FK_A_B FOREIGN KEY (b_id) REFERENCES B (id);
PostgreSQL dialect.
You might use an anonymous code block like this
do $$
declare
category_cursor cursor for select id from schema1.categories;
r_category bigint;
setting_id bigint;
begin
open category_cursor;
loop fetch category_cursor into r_category;
exit when not found;
insert into schema2.setting(field)
values ('field_value') returning id into setting_id;
update schema1.categories set category_setting_id = setting_id
where id = r_category;
end loop;
end; $$
Let's assume we have two tables: first, categories; second, the settings which must be applied to these categories.
1) Declare a cursor (collecting the ids from categories) and the variables where we store temporary data
2) Loop over the cursor, inserting the value 'field_value' into settings
3) Store the returned id in the variable setting_id
4) Update table categories with setting_id