PostgreSQL Upsert with a WHERE clause - sql

I am trying to migrate an Oracle MERGE query to PostgreSQL. As described in this article, the Postgres UPSERT syntax supports a "where clause" to identify the conditions of conflict.
Unfortunately, that page does not provide an example with the "where clause". I tried searching for one elsewhere but could not find it. Hence this question.
Following the same example from that page, here is the setup:
CREATE TABLE customers (
customer_id serial PRIMARY KEY,
name VARCHAR UNIQUE,
email VARCHAR NOT NULL,
active bool NOT NULL DEFAULT TRUE
);
INSERT INTO customers (NAME, email) VALUES
('IBM', 'contact@ibm.com'),
('Microsoft', 'contact@microsoft.com'),
('Intel','contact@intel.com');
SELECT * FROM customers;
 customer_id |   name    |         email         | active
-------------+-----------+-----------------------+--------
           1 | IBM       | contact@ibm.com       | t
           2 | Microsoft | contact@microsoft.com | t
           3 | Intel     | contact@intel.com     | t
(3 rows)
I want my UPSERT statement to look something like this:
INSERT INTO customers (NAME, email)
VALUES
('Microsoft', 'hotline@microsoft.com')
ON CONFLICT where (name = 'Microsoft' and active = TRUE)
DO UPDATE SET email = 'hotline@microsoft.com';
The example is a bit contrived but I hope I have been able to communicate the gist here.

You need a partial unique index. Drop the unique constraint on the column name and create a partial unique index on the column instead:
CREATE TABLE customers (
customer_id serial PRIMARY KEY,
name VARCHAR,
email VARCHAR NOT NULL,
active bool NOT NULL DEFAULT TRUE
);
CREATE UNIQUE INDEX ON customers (name) WHERE active;
INSERT INTO customers (NAME, email) VALUES
('IBM', 'contact@ibm.com'),
('Microsoft', 'contact@microsoft.com'),
('Intel','contact@intel.com');
The query should look like this:
INSERT INTO customers (name, email)
VALUES
('Microsoft', 'hotline@microsoft.com')
ON CONFLICT (name) WHERE active
DO UPDATE SET email = excluded.email;
SELECT *
FROM customers;
 customer_id |   name    |         email         | active
-------------+-----------+-----------------------+--------
           1 | IBM       | contact@ibm.com       | t
           3 | Intel     | contact@intel.com     | t
           2 | Microsoft | hotline@microsoft.com | t
(3 rows)
Note the proper use of the special record excluded. Per the documentation:
The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row using the table's name (or an alias), and to rows proposed for insertion using the special excluded table.
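For instance (a small sketch using the table above), the conflict action can carry its own WHERE clause so the row is only rewritten when the incoming email actually differs from the stored one:
INSERT INTO customers (name, email)
VALUES ('Microsoft', 'hotline@microsoft.com')
ON CONFLICT (name) WHERE active
DO UPDATE SET email = excluded.email
WHERE customers.email IS DISTINCT FROM excluded.email;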

Related

PostgreSQL: How to create a prefixed sequence as a unique identifier?

One of my clients insists that I create a unique identifier that starts with a given prefix, then increments by one in a postgres table. For example, PREFIX000001.
I know postgres provides SERIAL, BIGSERIAL and UUID to uniquely identify rows in a table. But the client just does not listen. He wants it his way.
Here is the sample table (an Excel mock-up): something like a unique_id column that auto-generates on every INSERT command.
I really want to know whether this is technically possible in postgres.
How should I go about this?
You could create a SERIAL or BIGSERIAL like you suggested but represent it with a string when reporting the data in the application (if the client would accept that):
SELECT to_char(id, '"PREFIX0"FM0000000') AS unique_id, product_name, product_desc FROM table;
For example:
SELECT to_char(123, '"PREFIX0"FM0000000') AS unique_id;
unique_id
----------------
PREFIX00000123
(1 row)
Time: 2.704 ms
Otherwise you would have to do this:
CREATE SEQUENCE my_prefixed_seq;
CREATE TABLE my_table (
unique_id TEXT NOT NULL DEFAULT 'PREFIX'||to_char(nextval('my_prefixed_seq'::regclass), 'FM0000000'),
product_name text,
product_desc text
);
INSERT INTO my_table (product_name) VALUES ('Product 1');
INSERT INTO my_table (product_name) VALUES ('Product 2');
INSERT INTO my_table (product_name) VALUES ('Product 3');
Then:
SELECT * FROM my_table;
unique_id | product_name | product_desc
---------------+--------------+--------------
PREFIX0000004 | Product 1 | {NULL}
PREFIX0000005 | Product 2 | {NULL}
PREFIX0000006 | Product 3 | {NULL}
(3 rows)
Time: 3.595 ms
I would advise you to try to make the client reconsider, but it looks like you already tried that route.
To whoever reads this in the future: please don't do this to your database; it is not good practice, as @Beki acknowledged in his question.
As Gab says, that's a pretty cumbersome thing to do. If you also want to keep a normal primary key for internal use in your app, here's a solution:
CREATE OR REPLACE FUNCTION add_prefix(INTEGER) RETURNS text AS
$$ select 'PREFIX'||to_char($1, 'FM0000000'); $$
LANGUAGE sql immutable;
CREATE TABLE my_table (
id SERIAL PRIMARY KEY,
unique_id TEXT UNIQUE NOT NULL GENERATED ALWAYS AS
(add_prefix(id)) STORED,
product_name text
);
INSERT INTO my_table (product_name) VALUES ('Product 1');
INSERT INTO my_table (product_name) VALUES ('Product 2');
INSERT INTO my_table (product_name) VALUES ('Product 3');
select * from my_table;
id | unique_id | product_name
----+---------------+--------------
1 | PREFIX0000001 | Product 1
2 | PREFIX0000002 | Product 2
3 | PREFIX0000003 | Product 3
Sure, you get an extra index gobbling up RAM and disk space for nothing. But, when the client then inevitably asks you "I want to update the unique identifier" in a few months...
Or even worse, "why are there holes in the sequence can't you make it so there are no holes"...
...then you won't have to update ALL the relations in all the tables...
One method uses a sequence:
create sequence t_seq;
create table t (
unique_id varchar(255) default ('PREFIX' || lpad(nextval('t_seq')::text, 6, '0'))
);
Here is a db<>fiddle.
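For example (a small sketch, assuming t_seq is freshly created and starts at 1), letting the default fire produces the prefixed value:
INSERT INTO t DEFAULT VALUES;
SELECT unique_id FROM t;
-- expected with a fresh sequence: PREFIX000001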

Generate unique IDs in non-unique columns

Consider this table:
ex_table
| gid | val |
| --- | --- |
| 1 | v1 |
| 1 | v2 |
| 2 | v3 |
Notice that gid is the id-like column and not unique.
I want to be able to insert values into the table either by generating a unique gid or by specifying which one to use.
For example:
INSERT INTO ex_table (val)
SELECT --....
Should generate a unique gid, while
INSERT INTO ex_table (gid, val)
SELECT --....
Should use the provided gid.
Any way to do this?
You can do exactly what you describe by using OVERRIDING SYSTEM VALUE together with an identity (auto-generated) column. For instance:
create table t (
gid int generated always as identity,
name varchar(255)
);
Then
insert into t (name) values ('abc');
insert into t (gid, name) overriding system value values (1, 'def');
will insert two rows with a gid value of 1.
Here is an example.
Just one caveat: Inserting your own value does not change the next value that is automatically generated. So, if you manually insert values that do not exist, then you might find that duplicates are later generated for them.
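If that happens, one way to realign the generator (a sketch using the table t above; pg_get_serial_sequence works for identity columns as well as serial ones) is to bump the backing sequence past the current maximum:
SELECT setval(pg_get_serial_sequence('t', 'gid'),
              COALESCE((SELECT max(gid) FROM t), 1));
-- the next automatically generated gid will then be max(gid) + 1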
You can try something like this:
CREATE SEQUENCE table_name_id_seq;
CREATE TABLE table_name (
gid integer NOT NULL DEFAULT nextval('table_name_id_seq'),
name varchar);
ALTER SEQUENCE table_name_id_seq
OWNED BY table_name.gid;
Or simply:
CREATE TABLE table_name(
gid SERIAL,
name varchar);
And then, to insert:
INSERT INTO table_name (gid, name)
VALUES (DEFAULT, 'Apple');

Insert multiple values with foreign key Postgresql

I am having trouble figuring out how to insert multiple values into a table while checking that another table has the needed values stored. I am currently doing this on a PostgreSQL server, but will be implementing it with PreparedStatements in my Java program.
user_id is a foreign key which references the primary key in mock2. I have been trying to check whether mock2 has the values ('foo1', 'bar1') and ('foo2', 'bar2').
After this I am trying to insert new values into mock1, which would have a date and an integer value and reference the primary key of the row in mock2 via the foreign key in mock1.
mock1 table looks like this:
===============================
| date | time | user_id |
| date | integer | integer |
| | | |
And the table mock2 is:
==================================
| Id | name | program |
| integer | text | text |
Id is a primary key for the table and the name is UNIQUE.
I've been playing around with this solution https://dba.stackexchange.com/questions/46410/how-do-i-insert-a-row-which-contains-a-foreign-key
However, I haven't been able to make it work. Could someone please point out the correct syntax for this? I would really appreciate it.
EDIT:
The create table statements are:
CREATE TABLE mock2(
id SERIAL PRIMARY KEY UNIQUE,
name text NOT NULL,
program text NOT NULL UNIQUE
);
and
CREATE TABLE mock1(
date date,
time_spent INTEGER,
user_id integer REFERENCES mock2(Id) NOT NULL);
Ok so I found an answer to my own question.
WITH ins (date,time_spent, id) AS
( VALUES
( '22/08/2012', 170, (SELECT id FROM mock3 WHERE program ='bar'))
)
INSERT INTO mock4
(date, time_spent, user_id)
SELECT
ins.date, ins.time_spent, mock3.id
FROM
mock3 JOIN ins
ON ins.id = mock3.id ;
I was trying to take the two values from the first table, match them, and then insert two new values into the next table, but I realised that I should be using the primary and foreign keys to my advantage.
I instead now JOIN on the id and simply select the key I need with (SELECT id FROM mock3 WHERE program = 'bar') in the VALUES row.
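For what it's worth, a shorter form that skips the CTE (using the mock1/mock2 names from the question, and assuming program = 'bar' identifies the parent row you need) would be:
INSERT INTO mock1 (date, time_spent, user_id)
SELECT DATE '2012-08-22', 170, id
FROM mock2
WHERE program = 'bar';
-- inserts one row per matching parent; nothing is inserted if no row matches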

Schema created but tables therein cannot be displayed

Here is the code:
testdb=# create schema myschema1;
CREATE SCHEMA
testdb=# \d
List of relations
Schema | Name | Type | Owner
--------+------------+-------+--------
public | company | table | kaiyin
public | department | table | kaiyin
(2 rows)
testdb=#
testdb=# create table myschema1.company1(
testdb(# ID INT NOT NULL,
testdb(# NAME VARCHAR (20) NOT NULL,
testdb(# AGE INT NOT NULL,
testdb(# ADDRESS CHAR (25) ,
testdb(# SALARY DECIMAL (18, 2),
testdb(# PRIMARY KEY (ID)
testdb(# );
CREATE TABLE
testdb=# \d
List of relations
Schema | Name | Type | Owner
--------+------------+-------+--------
public | company | table | kaiyin
public | department | table | kaiyin
(2 rows)
testdb=# select * from myschema1.company1
Nothing appears. Maybe it's because the table is empty?
testdb=# insert into myschema1.company1 values (1, joyce, 23, amsterdam, 60000, 1)
testdb-# select * from myschema1.company1
testdb-#
Still nothing. Why?
OS X, postgresql 9.4.4
The \d command in psql uses the current search_path of schemas, which by default does not include myschema1. You need to tell it to look in your schema: \d myschema1.*
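Alternatively, you can put the schema on your search_path for the session; after that, a bare \d lists its tables and unqualified names resolve to it:
SET search_path TO myschema1, public;
-- \d now lists myschema1.company1 as well
\d
select * from company1;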
You need to terminate a statement with a ;
As written, you are actually writing one long multi-line statement. That's why your prompt turned from testdb=# to testdb-#; it means you're still in the middle of a statement. (It usefully does the same with parentheses, giving a prompt of testdb(# if you're still inside an unclosed parenthesis.)
Press <ctrl>-<c> to cancel your current statement, then try it as follows (note the quoted string values and one value per column):
insert into myschema1.company1 values (1, 'joyce', 23, 'amsterdam', 60000);
select * from myschema1.company1;

Redshift psql auto increment on even number

I am trying to create a table with an auto-increment column as below. Since Redshift doesn't support SERIAL, I had to use the IDENTITY data type:
IDENTITY(seed, step)
Clause that specifies that the column is an IDENTITY column. An IDENTITY column contains unique auto-generated values. These values start with the value specified as seed and increment by the number specified as step. The data type for an IDENTITY column must be either INT or BIGINT.
My create table statement looks like this:
CREATE TABLE my_table(
id INT IDENTITY(1,1),
name CHARACTER VARYING(255) NOT NULL,
PRIMARY KEY( id )
);
However, when I tried to insert data into my_table, the generated ids came out as even numbers only, like below:
id | name |
----+------+
2 | anna |
4 | tom |
6 | adam |
8 | bob |
10 | rob |
My insert statements look like below:
INSERT INTO my_table ( name )
VALUES ( 'anna' ), ('tom') , ('adam') , ('bob') , ('rob' );
I am also having trouble getting the id column to start back at 1. There are solutions for the SERIAL data type, but I haven't seen any documentation for IDENTITY.
Any suggestions would be much appreciated!
You have to set your identity as follows:
id INT IDENTITY(0,1)
Source: http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_examples.html
And you can't reset the id to 0. You will have to drop the table and create it again.
Set your seed value to 1 and your step value to 1.
Create table
CREATE table my_table(
id bigint identity(1, 1),
name varchar(100),
primary key(id));
Insert rows
INSERT INTO my_table (name)
VALUES ('anna'), ('tom'), ('adam'), ('bob'), ('rob');
Results
id | name |
----+------+
1 | anna |
2 | tom |
3 | adam |
4 | bob |
5 | rob |
For some reason, if you set your seed value to 0 and your step value to 1 then the integer will increase in steps of 2.
Create table
CREATE table my_table(
id bigint identity(0, 1),
name varchar(100),
primary key(id));
Insert rows
INSERT INTO my_table (name)
VALUES ('anna'), ('tom'), ('adam'), ('bob'), ('rob');
Results
id | name |
----+------+
0 | anna |
2 | tom |
4 | adam |
6 | bob |
8 | rob |
This issue is discussed at length in the AWS forums:
https://forums.aws.amazon.com/message.jspa?messageID=623201
The answer from AWS:
Short answer to your question is seed and step are only honored if you disable both parallelism and the COMPUPDATE option in your COPY.
Parallelism is disabled if and only if you're loading your data from a single file, which is what we normally do not recommend, and hence will be an unlikely scenario for most users.
Parallelism impacts things because in order to ensure that there is no single point of contention in assigning identity values to rows, there end up being gaps in the value assignment. When parallelism is disabled, the load is happening serially, and therefore, there is no issue with assigning different id values in parallel.
The reason COMPUPDATE impacts things is when it's enabled, the COPY is actually making 2 passes over your data. During the first pass, it internally increments the identity values, and as a result, your initial value starts with a larger value than you'd expect.
We'll update the doc to reflect this.
Multiple nodes also seem to cause this effect with IDENTITY columns. In essence, IDENTITY can only guarantee that the generated ids are unique, not that they are consecutive.
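If you do end up with gaps and need a dense, 1-based id after loading, one workaround (just a sketch; my_table and its columns are the ones from this question) is to rebuild the table with ROW_NUMBER():
CREATE TABLE my_table_renumbered AS
SELECT ROW_NUMBER() OVER (ORDER BY id) AS id,
       name
FROM my_table;
-- my_table_renumbered now has ids 1..N, but note it has no IDENTITY column of its own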