How to UPSERT multiple rows with individual values in one statement?

I've created the following table:
CREATE TABLE t1(
a INT UNIQUE,
b varchar(100) NOT NULL,
c INT,
d INT DEFAULT 0,
PRIMARY KEY (a,b));
On a single row this SQL statement works great (the SQL is generated in code):
INSERT INTO t1 (a, b, c, d)
VALUES (${params[1]}, '${params[2]}', ${params[3]}, ${params[4]})
ON CONFLICT (a,b) DO
UPDATE SET d = ${params[4]}
Is it possible to upsert multiple rows at once? For each update the value of params.4 is different.
var sqlStr = 'INSERT INTO t1 (a, b, c, d) VALUES '
for (let i = 0; i < params.length; i++) {
  sqlStr += `(${params[i][1]}, '${params[i][2]}', ${params[i][3]}, ${params[i][4]}),`
}
sqlStr = sqlStr.substring(0, sqlStr.length - 1)  // drop the trailing comma
sqlStr += ' ON CONFLICT (a,b) DO UPDATE SET **d=???**' <-- this is the problem
params[i][4] has a different value for each row, the ON CONFLICT clause appears only once (not once per row), and SET doesn't support a WHERE clause.
Example, if my table has the following rows:
a | b | c | d
---+---+---+---
1 | 1 | 1 | 1
2 | 2 | 2 | 2
And my new input is [(1,'1',1,11),(2,'2',2,22),(3,'3',3,3)].
There are two conflicts - (1,1) and (2,2). The result should be:
a | b | c | d
---+---+---+---
1 | 1 | 1 | 11
2 | 2 | 2 | 22
3 | 3 | 3 | 3

UPSERT (INSERT ... ON CONFLICT ... DO UPDATE) automatically makes each row proposed for insertion available in the special table named EXCLUDED. The manual:
Note that the special excluded table is used to reference values originally proposed for insertion
So it's really very simple:
INSERT INTO t1 (a, b, c, d)
VALUES (...)
ON CONFLICT (a,b) DO UPDATE
SET d = EXCLUDED.d; -- that's all !!!
Besides being much simpler and faster, there is a subtle corner-case difference from your proposed solution. The manual:
Note that the effects of all per-row BEFORE INSERT triggers are
reflected in excluded values, since those effects may have contributed
to the row being excluded from insertion.
Plus, column DEFAULT values are already applied in the EXCLUDED row, where no input was given. (Like DEFAULT 0 for your column d.)
Both are typically what you want.
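To make the EXCLUDED approach concrete, here is a minimal runnable sketch using Python's sqlite3 module; SQLite (3.24+) happens to use the same ON CONFLICT ... DO UPDATE SET col = EXCLUDED.col syntax as PostgreSQL. The table and data mirror the question (the UNIQUE constraint on a alone is omitted here so the conflict target is unambiguous).

```python
import sqlite3

# Table and sample data from the question, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t1 (
    a INTEGER, b TEXT NOT NULL, c INTEGER, d INTEGER DEFAULT 0,
    PRIMARY KEY (a, b))""")
conn.execute("INSERT INTO t1 VALUES (1, '1', 1, 1), (2, '2', 2, 2)")

# New input: two conflicting rows, (1,'1') and (2,'2'), plus one new row.
rows = [(1, '1', 1, 11), (2, '2', 2, 22), (3, '3', 3, 3)]
conn.executemany("""
    INSERT INTO t1 (a, b, c, d) VALUES (?, ?, ?, ?)
    ON CONFLICT (a, b) DO UPDATE SET d = EXCLUDED.d
""", rows)

print(conn.execute("SELECT a, b, c, d FROM t1 ORDER BY a").fetchall())
# [(1, '1', 1, 11), (2, '2', 2, 22), (3, '3', 3, 3)]
```

Each conflicting row picks up its own d from EXCLUDED, which is exactly the per-row behavior the question asks for.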

Writing the solution for future users dealing with the same issue
Step 1: Create a temporary table (clone of t1)
CREATE TEMP TABLE t2(
a INT UNIQUE,
b varchar(100) NOT NULL,
c INT,
d INT DEFAULT 0,
PRIMARY KEY (a,b));
OR
CREATE TEMP TABLE t2 AS SELECT * FROM t1 WITH NO DATA;
Step 2: Insert the new input to t2
INSERT INTO t2 (a, b, c, d)
VALUES (1,'1',1,1),(2,'2',2,2),(3,'3',3,3);
Step 3: UPSERT into t1; on conflict, select d from t2
INSERT INTO t1 (a, b, c, d)
VALUES (1,'1',1,1),(2,'2',2,2),(3,'3',3,3)
ON CONFLICT(a,b) DO UPDATE
SET d=(SELECT d FROM t2 WHERE t2.a=t1.a AND t2.b=t1.b);
Step 4: Drop t2
DROP TABLE t2;
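The staging-table variant can be sketched end to end as follows, again using SQLite to illustrate (PostgreSQL syntax is the same here). Note that the correlated subquery must match on both key columns, a and b.

```python
import sqlite3

# Staged upsert: load new rows into t2, then upsert into t1, pulling d
# from t2 on conflict, and finally drop the staging table.
conn = sqlite3.connect(":memory:")
for t in ("t1", "t2"):
    conn.execute(f"""CREATE TABLE {t} (
        a INTEGER, b TEXT NOT NULL, c INTEGER, d INTEGER DEFAULT 0,
        PRIMARY KEY (a, b))""")
conn.execute("INSERT INTO t1 VALUES (1, '1', 1, 1), (2, '2', 2, 2)")

new_rows = [(1, '1', 1, 11), (2, '2', 2, 22), (3, '3', 3, 3)]
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?, ?)", new_rows)
conn.executemany("""
    INSERT INTO t1 (a, b, c, d) VALUES (?, ?, ?, ?)
    ON CONFLICT (a, b) DO UPDATE
    SET d = (SELECT d FROM t2 WHERE t2.a = t1.a AND t2.b = t1.b)
""", new_rows)
conn.execute("DROP TABLE t2")

print(conn.execute("SELECT a, b, c, d FROM t1 ORDER BY a").fetchall())
```

This produces the same result as the EXCLUDED one-liner, at the cost of an extra table and a correlated subquery per conflicting row.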

Related

Updating a list of data in a normalised table postgres

I have two tables. One is table A which contains an id. Table B is a normalised table that contains a foreign key to table A and some other column called value.
e.g.
Table B
| id | fk| value
Table A
|pk| ... |
Basically I have a list (of length n) of values that I want to insert into table B, all tied to one foreign key, e.g. list = [a, b, c, d], key = 1. The problem is that table B might already have some of these values, so I only want to insert the ones that aren't already in that table, and also delete the ones that are no longer in my list.
list = [a, b, c, d], key = 1
table B
| id |fk | value
| 1 | 1 | a
| 2 | 1 | b
| 3 | 1 | e
Is there a way that I can insert only c and d from the list into the table and delete e from the table in one statement? My current attempt is to delete every entry that matches the key and then insert them all, but I don't think that's the most efficient way to do this.
Why not just truncate b and insert the new values?
truncate table b;
insert into b (fk, value)
<your list here>;
Or if key is a column in b and you want to delete all keys with a given value:
delete from b where key = 1;
insert into b (fk, value, key)
<your list here with "1" for the key>
This doesn't preserve the id column from b, but your question does not mention that as being important.
An alternative method would use CTEs:
with data(fk, value) as (
<your list here>
),
d as (
delete from b
where (b.fk, b.value) not in (select d.fk, d.value from data d)
)
insert into b (fk, value)
select d.fk, d.value
from data d
where (d.fk, d.value) not in (select b.fk, b.value from b);
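The writable CTE above (DELETE inside WITH) is PostgreSQL-specific. A minimal sketch of the same delete-the-missing / insert-the-new logic as two plain statements inside one transaction, using SQLite for illustration and the sample data from the question:

```python
import sqlite3

# Table b holds (id, fk, value); we sync its contents toward the desired list.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, fk INTEGER, value TEXT)")
conn.executemany("INSERT INTO b (fk, value) VALUES (?, ?)",
                 [(1, 'a'), (1, 'b'), (1, 'e')])

desired = [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd')]
with conn:  # one transaction: both statements commit (or roll back) together
    conn.execute("CREATE TEMP TABLE data (fk INTEGER, value TEXT)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", desired)
    conn.execute("""DELETE FROM b
                    WHERE (fk, value) NOT IN (SELECT fk, value FROM data)""")
    conn.execute("""INSERT INTO b (fk, value)
                    SELECT fk, value FROM data
                    WHERE (fk, value) NOT IN (SELECT fk, value FROM b)""")

print(conn.execute("SELECT fk, value FROM b ORDER BY value").fetchall())
# [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd')]
```

Like the CTE version, this does not preserve the id column for re-inserted rows.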

SQL Select Where Opposite Match Does Not Exist

Trying to compare two columns and check that no record exists with the reversal of those two columns. In other words, I'm looking for instances where 1->3 exists but 3->1 does not. If 1->2 and 2->1 both exist, we will still consider 1 to be part of the results.
Table = Betweens
start_id | end_id
1 | 2
2 | 1
1 | 3
1 would be added, since the pair (1,3) is a start-to-end with no opposite (3,1) present. It did not qualify until the 3rd entry, though, since 1 and 2 had an opposite.
So, eventually it will just return names where the reversal does not exist.
I then want to join another table that maps each number from the previous problem to its name.
Table = Names
id | name
1 | Mars
2 | Earth
3 | Jupiter
So results will just be the names of those that don't have an opposite.
You can use a not exists condition:
select t1.start_id, t1.end_id
from the_table t1
where not exists (select *
from the_table t2
where t2.end_id = t1.start_id
and t2.start_id = t1.end_id);
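The NOT EXISTS approach, joined to the Names table to return the names directly, can be run as-is; here is a sketch using SQLite with the sample data from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE betweens (start_id INTEGER, end_id INTEGER)")
conn.executemany("INSERT INTO betweens VALUES (?, ?)", [(1, 2), (2, 1), (1, 3)])
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO names VALUES (?, ?)",
                 [(1, 'Mars'), (2, 'Earth'), (3, 'Jupiter')])

# Keep start_ids whose (start, end) pair has no reversed counterpart,
# then map them to names.
rows = conn.execute("""
    SELECT DISTINCT n.name
    FROM betweens t1
    JOIN names n ON n.id = t1.start_id
    WHERE NOT EXISTS (SELECT 1 FROM betweens t2
                      WHERE t2.end_id = t1.start_id
                        AND t2.start_id = t1.end_id)
""").fetchall()
print(rows)  # [('Mars',)]
```

(1,2) and (2,1) cancel each other out; only (1,3) survives, giving Mars.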
I'm not sure about your data volume, but given your ask, the query below will produce the desired result in SQL Server.
create table TableBetweens
(start_id INT,
end_id INT
)
INSERT INTO TableBetweens VALUES(1,2)
INSERT INTO TableBetweens VALUES(2,1)
INSERT INTO TableBetweens VALUES(1,3)
create table TableNames
(id INT,
NAME VARCHAR(50)
)
INSERT INTO TableNames VALUES(1,'Mars')
INSERT INTO TableNames VALUES(2,'Earth')
INSERT INTO TableNames VALUES(3,'Jupiter')
SELECT *
FROM TableNames c
WHERE c.id IN (
SELECT nameid1.nameid
FROM (SELECT a.start_id, a.end_id
FROM TableBetweens a
LEFT JOIN TableBetweens b
ON a.start_id = b.end_id AND a.end_id = b.start_id
WHERE b.end_id IS NULL
AND b.start_id IS NULL) filterData
UNPIVOT
(
nameid
FOR id IN (start_id, end_id)
) AS nameid1
)

How to give a database constraint to ensure this behavior in a table?

I have a table with five columns: A, B, C, D and E.
And I need to comply with the following restrictions:
A is the primary key.
For a B there can only be one C, ie: 1-1 ; 2-1 ; 3-2 but not 1-2.
B-C and D can take any value but can not be repeated, ie: 1-1 1 ; 1-1 2 ; not 1-1 1 again.
E can take any value.
So, considering the following order
| A | B | C | D | E |
| 1 | 1 | 1 | 1 | 1 | -> OK
| 2 | 1 | 2 | 1 | 1 | -> Should fail, because there is a B with another C, 1-2 must be 1-1.
| 3 | 1 | 1 | 2 | 1 | -> OK
| 4 | 1 | 1 | 2 | 1 | -> Should fail, because relation between B-C and D is repeated.
| 5 | 2 | 1 | 1 | 1 | -> OK
Is there any way to comply with this behavior through some constraint in the database?
Thanks!
A and E are irrelevant to the question and can be ignored.
The BCD rule can be easily solved by creating a unique index on BCD.
If for every B there can be only one C then your DB is not normalized. Create a new table with B and C. Make B the primary key or create a unique index on B. Then remove C from the original table. (At which point the unique index on BCD becomes a unique index on BD.)
Without normalizing the tables, I don't think there's any way to do it with a constraint. You could certainly do it with a trigger or with code.
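The normalized design can be sketched concretely (hypothetical table names bc and main): bc stores the single C allowed for each B, and the main table keeps a unique (B, D) pair plus a foreign key on B. Using SQLite to illustrate with the rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if enabled
conn.execute("CREATE TABLE bc (b INTEGER PRIMARY KEY, c INTEGER NOT NULL)")
conn.execute("""CREATE TABLE main (
    a INTEGER PRIMARY KEY,
    b INTEGER NOT NULL REFERENCES bc(b),
    d INTEGER NOT NULL, e INTEGER,
    UNIQUE (b, d))""")

conn.execute("INSERT INTO bc VALUES (1, 1)")
conn.execute("INSERT INTO main VALUES (1, 1, 1, 1)")      # row 1: OK
try:
    conn.execute("INSERT INTO bc VALUES (1, 2)")          # B=1 with a second C
except sqlite3.IntegrityError as e:
    print("rejected (one C per B):", e)
conn.execute("INSERT INTO main VALUES (3, 1, 2, 1)")      # row 3: OK
try:
    conn.execute("INSERT INTO main VALUES (4, 1, 2, 1)")  # repeats (B, D)
except sqlite3.IntegrityError as e:
    print("rejected (B-D repeated):", e)
```

This mirrors the expected pass/fail pattern in the question without any trigger: the bc primary key enforces "one C per B", and the unique index enforces the B-D rule.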
For B - C rule I would create a trigger
For the B - C - D rule looks like you want unique constraint
ALTER TABLE t ADD CONSTRAINT uni_BCD UNIQUE (B,C,D);
This condition is not trivial, since Oracle does not support CREATE ASSERTION (soon, we hope!):
For a B there can only be one C, ie: 1-1 ; 2-1 ; 3-2 but not 1-2.
Therefore, you need to involve a second table to enforce this constraint, or else a statement-level AFTER INSERT/UPDATE trigger.
What I would do is create a second table and have it maintained via an INSTEAD OF trigger on a view, and ensure all my application DML happened via the view. (You could also just create a regular trigger on the table and have it maintain the second table. That's just not my preference. I find INSTEAD OF triggers to be more flexible and more visible.)
In case it's not clear, the purpose of the second table is that it allows you to enforce your constraint as a FOREIGN KEY constraint. The UNIQUE or PRIMARY KEY constraint on the second table ensures that each value of B appears only once.
Here's sample code for that approach:
--DROP TABLE table1_parent;
--DROP TABLE table1;
CREATE TABLE table1_parent
( b number NOT NULL,
c number NOT NULL,
constraint table1_parent_pk PRIMARY KEY (b),
constraint table1_parent_u1 UNIQUE (b, c) );
CREATE TABLE table1
(
a NUMBER NOT NULL,
b NUMBER NOT NULL,
c NUMBER NOT NULL,
d NUMBER NOT NULL,
e NUMBER NOT NULL,
CONSTRAINT table1_pk PRIMARY KEY (a), -- "A is the primary key."
CONSTRAINT table1_fk FOREIGN KEY ( b, c ) REFERENCES table1_parent ( b, c ), -- "For a B there can only be one C, ie: 1-1 ; 2-1 ; 3-2 but not 1-2."
CONSTRAINT table1_u2 UNIQUE ( b, c, d ) -- "B-C and D can take any value but can not be repeated, ie: 1-1 1 ; 1-1 2 ; not 1-1 1 again."
);
CREATE INDEX table1_n1 ON table1 (b,c); -- Always index foreign keys
CREATE OR REPLACE VIEW table1_dml_v AS SELECT * FROM table1;
CREATE OR REPLACE TRIGGER table1_dml_v_trg INSTEAD OF INSERT OR UPDATE OR DELETE ON table1_dml_v
DECLARE
l_cnt NUMBER;
BEGIN
IF INSERTING THEN
BEGIN
INSERT INTO table1_parent (b, c) VALUES ( :new.b, :new.c );
EXCEPTION
WHEN dup_val_on_index THEN
NULL; -- parent already exists, no problem
END;
INSERT INTO table1 ( a, b, c, d, e ) VALUES ( :new.a, :new.b, :new.c, :new.d, :new.e );
END IF;
IF DELETING THEN
DELETE FROM table1 WHERE a = :old.a;
SELECT COUNT(*) INTO l_cnt FROM table1 WHERE b = :old.b AND c = :old.c;
IF l_cnt = 0 THEN
DELETE FROM table1_parent WHERE b = :old.b AND c = :old.c;
END IF;
END IF;
IF UPDATING THEN
BEGIN
INSERT INTO table1_parent (b, c) VALUES ( :new.b, :new.c );
EXCEPTION
WHEN dup_val_on_index THEN
NULL; -- parent already exists, no problem
END;
UPDATE table1 SET a = :new.a, b = :new.b, c = :new.c, d = :new.d, e = :new.e WHERE a = :old.a;
SELECT COUNT(*) INTO l_cnt FROM table1 WHERE b = :old.b AND c = :old.c;
IF l_cnt = 0 THEN
DELETE FROM table1_parent WHERE b = :old.b AND c = :old.c;
END IF;
END IF;
END;
insert into table1_dml_v ( a,b,c,d,e) VALUES (1,1,1,1,1);
insert into table1_dml_v ( a,b,c,d,e) VALUES (2,1,2,1,1);
insert into table1_dml_v ( a,b,c,d,e) VALUES (3,1,1,2,1);
insert into table1_dml_v ( a,b,c,d,e) VALUES (4,1,1,2,1);
insert into table1_dml_v ( a,b,c,d,e) VALUES (5,2,1,1,1);
If your system supports fast-refresh materialized views, please try the following. Since I currently don't have access to this feature, I can't verify the solution.
create materialized view log on t with primary key;
create materialized view t_mv
refresh fast
as
select b,c
from t
group by b,c
;
alter table t_mv add constraint t_mv_uq_b unique (b);
and of course:
alter table t add constraint t_uq_b_c_d unique (b,c,d);

High performance PostgreSQL: Calculate the difference between two sets of key-value tables and store the result

You may consider me a PostgreSQL beginner, and the purpose of this question is to get insights into how to get the best performance out of PostgreSQL for this problem. I have two tables which are identical in their structure but differ in their content.
|Table A|
key - value
1 dave
2 paul
3 michael
|Table B|
key - value
1 dave
2 chris
The problem is simple, to replace table A with table B, but to know which entries were inserted into or removed from table A in the operation.
My first (naive) solution involves doing the work in two stages using table joins to produce the intermediate lists for first the delete and then the insert operations. The results of those queries are stored on the client and are required for correct application function.
SELECT * FROM A LEFT JOIN B ON A.value = B.value WHERE B.value IS NULL;
DELETE FROM A WHERE value IN ('paul', 'michael');
SELECT * FROM B LEFT JOIN A ON A.value = B.value WHERE A.value IS NULL;
INSERT INTO A (value) VALUES ('chris');
This simple approach does technically work, by the end of the transaction table A will contain the same content as table B, but this strategy quickly becomes quite slow. To give an indication of the size of the tables, it's in the range of millions of rows, so performance at scale is a critical factor, and it would be nice to find a more optimal approach.
In order to address performance requirements, I plan to investigate the following:
Use of HStore back-end for optimal key-value storage performance.
Use of views for pre-calculating intermediate delete/insert queries.
Use of prepared queries to reduce SQL processing overhead.
My question to the experts is can you suggest what you consider to be the optimal strategy. Going slightly beyond the scope of my question, are there any hard and fast rules you can suggest?
Thank you so much for your time. All feedback is very welcome.
This is not perfect, but it works. The three cases (delete, update, insert) could possibly be combined into a full outer join.
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE table_a (
zkey INTEGER NOT NULL PRIMARY KEY
, zvalue varchar NOT NULL
, CONSTRAINT a_zvalue_alt UNIQUE (zvalue)
);
INSERT INTO table_a(zkey, zvalue) VALUES
(1, 'dave' )
,(2, 'paul' )
,(3, 'michael' )
;
CREATE TABLE table_b (
zkey INTEGER NOT NULL PRIMARY KEY
, zvalue varchar NOT NULL
, CONSTRAINT b_zvalue_alt UNIQUE (zvalue)
);
INSERT INTO table_b(zkey, zvalue) VALUES
(1, 'dave' )
,(2, 'chris' )
,(5, 'Arnold' )
;
CREATE TABLE table_diff (
zkey INTEGER NOT NULL
, zvalue varchar NOT NULL
, opcode INTEGER NOT NULL DEFAULT 0
);
WITH xx AS (
DELETE FROM table_a aa
WHERE NOT EXISTS (
SELECT * FROM table_b bb
WHERE bb.zkey = aa.zkey
)
RETURNING aa.zkey, aa.zvalue
)
INSERT INTO table_diff(zkey,zvalue,opcode)
SELECT xx.zkey, xx.zvalue, -1
FROM xx
;
SELECT * FROM table_diff;
WITH xx AS (
UPDATE table_a aa
SET zvalue= bb.zvalue
FROM table_b bb
WHERE bb.zkey = aa.zkey
AND bb.zvalue <> aa.zvalue
RETURNING aa.zkey, aa.zvalue
)
INSERT INTO table_diff(zkey,zvalue,opcode)
SELECT xx.zkey, xx.zvalue, 0
FROM xx
;
SELECT * FROM table_diff;
WITH xx AS (
INSERT INTO table_a (zkey, zvalue)
SELECT bb.zkey, bb.zvalue
FROM table_b bb
WHERE NOT EXISTS (
SELECT * FROM table_a aa
WHERE bb.zkey = aa.zkey
AND bb.zvalue = aa.zvalue
)
RETURNING zkey, zvalue
)
INSERT INTO table_diff(zkey,zvalue,opcode)
SELECT xx.zkey, xx.zvalue, 1
FROM xx
;
SELECT * FROM table_a;
SELECT * FROM table_b;
SELECT * FROM table_diff;
Result:
INSERT 0 3
CREATE TABLE
INSERT 0 1
zkey | zvalue | opcode
------+---------+--------
3 | michael | -1
(1 row)
INSERT 0 1
zkey | zvalue | opcode
------+---------+--------
3 | michael | -1
2 | chris | 0
(2 rows)
INSERT 0 1
zkey | zvalue
------+--------
1 | dave
2 | chris
5 | Arnold
(3 rows)
zkey | zvalue
------+--------
1 | dave
2 | chris
5 | Arnold
(3 rows)
zkey | zvalue | opcode
------+---------+--------
3 | michael | -1
2 | chris | 0
5 | Arnold | 1
(3 rows)
BTW: the original question is very vague about requirements. If table_diff were an actual history table, at least a timestamp column should be added, and zkey plus ztimestamp would be a natural choice for a key. Also, the whole process could be wrapped in a set of rules or triggers.
Try these queries:
DELETE FROM A
WHERE A.value NOT IN (SELECT B.value FROM B);
INSERT INTO A(value)
SELECT B.value
FROM B
WHERE B.value NOT IN (SELECT A.value FROM A);
With indexes on A.value and B.value these queries will be really fast.
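A runnable sketch of these two statements, using SQLite and the sample data from the question. One caveat worth noting: NOT IN misbehaves if the subquery can return NULL, which is not the case here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for t, vals in (("a", ['dave', 'paul', 'michael']), ("b", ['dave', 'chris'])):
    conn.execute(f"CREATE TABLE {t} (key INTEGER PRIMARY KEY, value TEXT UNIQUE)")
    conn.executemany(f"INSERT INTO {t} (value) VALUES (?)", [(v,) for v in vals])

# Delete rows of A missing from B, then insert rows of B missing from A.
conn.execute("DELETE FROM a WHERE value NOT IN (SELECT value FROM b)")
conn.execute("""INSERT INTO a (value)
                SELECT value FROM b
                WHERE value NOT IN (SELECT value FROM a)""")

print(conn.execute("SELECT value FROM a ORDER BY value").fetchall())
# [('chris',), ('dave',)]
```

Afterwards A holds exactly B's values; capturing the affected rows for the client would still require RETURNING (PostgreSQL) or the separate SELECTs from the question.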
If you have value indexed in both tables, and value is unique in each table, this is a case for a full outer join, which should be able to merge the two by walking through the indices:
SELECT CASE WHEN B.value IS NULL THEN
'DELETE FROM A WHERE A.value = ' || quote_literal(A.value)
ELSE
'INSERT INTO A(value) VALUES(' || quote_literal(B.value) || ')'
END
FROM A FULL OUTER JOIN B ON A.value = B.value
WHERE A.value IS DISTINCT FROM B.value
The SQL generation here is really just to demo what the output of the query is.

Updating duplicate values

I have a table with a unique index; let's say:
A B
1 1
1 2
1 3
1 4
And I want to update the B column. Since SQL updates rows one by one, if it tries to update the first row's B to 2, for example, I get:
A B
1 2
1 2
1 3
1 4
As a result I would get two duplicate values in the 1st and 2nd rows, and of course an error message.
Instead, the table should be updated like this:
A B
1 2
1 1
1 3
1 4
So what's the course of action I should follow in this scenario?
Regards.
Maybe I should update the question a bit: what if I wanted to change the B column completely, such as:
A B
1 4
1 2
1 3
1 1
Try this
UPDATE tbl
SET B = 3 - B
WHERE A = 1 AND B IN (1, 2)
Or, more generally, you can use something like this:
UPDATE tbl
SET B = CASE B
WHEN 1 THEN 2
WHEN 2 THEN 1
END
WHERE A = 1 AND B IN (1, 2)
Another way:
add column C
fill column C with the new values in your loop
update field B from C:
UPDATE tbl
SET B = C
The solution is to do the swap in a single statement:
UPDATE YOUR_TABLE
SET B = (CASE B WHEN 1 THEN 2 ELSE 1 END)
WHERE A = 1 AND B IN (1, 2)
--- EDIT ---
To update several rows at once, you can do a JOIN-update from the temporary table:
CREATE TABLE #ROWS_TO_UPDATE (
A int,
B int,
NEW_B int,
PRIMARY KEY (A, B)
);
INSERT INTO #ROWS_TO_UPDATE (A, B, NEW_B) VALUES (1, 1, 4);
INSERT INTO #ROWS_TO_UPDATE (A, B, NEW_B) VALUES (1, 2, 3);
INSERT INTO #ROWS_TO_UPDATE (A, B, NEW_B) VALUES (1, 3, 2);
INSERT INTO #ROWS_TO_UPDATE (A, B, NEW_B) VALUES (1, 4, 1);
UPDATE YOUR_TABLE
SET B = NEW_B
FROM
YOUR_TABLE JOIN #ROWS_TO_UPDATE
ON YOUR_TABLE.A = #ROWS_TO_UPDATE.A AND YOUR_TABLE.B = #ROWS_TO_UPDATE.B;
DROP TABLE #ROWS_TO_UPDATE;
The above code transforms the following data...
A B
1 1
1 2
1 3
1 4
...to this:
A B
1 4
1 2
1 3
1 1
If I understand well, you have a primary key composed of two columns, and you want to swap the first two rows' PKs.
If you don't have foreign keys which refer to this primary key, simply change one of the keys to a temporary unused value:
A B
1 10000
1 2
then change the second row:
A B
1 10000
1 1
Finally, change the first one:
A B
1 2
1 1
If you have objects depending on this primary key, you would have to make a copy of the other columns (for example the other columns of 1 1) to a "temporary row", copy the data of the second (1 2) to the first (1 1) and finally copy the "temporary row" to the second (1 2)
This all depends on what exactly you're trying to do, and how. Is it a stored procedure, is it a query? You should show more context.
You could apply this technique to an unlimited number of rows. You can also create a temporary table with key equivalences and update your table from that temporary table. That way it's done in an atomic operation, and it won't violate the PK.
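The temporary-value swap described above can be sketched as a runnable example, using SQLite (where a single-statement swap would actually fail, since SQLite, like a non-deferrable PostgreSQL constraint, checks uniqueness per row). A third column tracks which row is which:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c TEXT, PRIMARY KEY (a, b))")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, 1, 'first'), (1, 2, 'second')])

# Each UPDATE leaves the (A, B) key unique, so no constraint ever fires.
conn.execute("UPDATE t SET b = 10000 WHERE a = 1 AND b = 1")  # park row 1
conn.execute("UPDATE t SET b = 1 WHERE a = 1 AND b = 2")      # row 2 -> B=1
conn.execute("UPDATE t SET b = 2 WHERE a = 1 AND b = 10000")  # row 1 -> B=2

print(conn.execute("SELECT b, c FROM t ORDER BY b").fetchall())
# [(1, 'second'), (2, 'first')]
```

The c column shows the keys really were swapped, not just rewritten to the same set of values.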
create table T
(A int, B int, C char(5),
primary key (A,B))
insert into T values(1,1,'first')
insert into T values(1,2,'secon')
insert into T values(1,3,'third')
create table #KeyChanges
(A int, B int, newA int, newB int)
insert into #KeyChanges values(1,1,1,3)
insert into #KeyChanges values(1,2,1,1)
insert into #KeyChanges values(1,3,1,2)
update T set T.A = KC.newA, T.B = KC.newB
from T
join #KeyChanges as KC on T.A = KC.A and T.B = KC.B