SQLite Update Execution Order with UNIQUE - sql

I am trying to do a bulk update to a table that has a UNIQUE constraint on the column I'm updating. Suppose the table is defined by:
CREATE TABLE foo (id INTEGER PRIMARY KEY, bar INTEGER UNIQUE);
Suppose the database contains a series of rows with contiguous integer values in the bar column ranging from 1 to 100, and that they've been inserted sequentially.
Suppose I want to put a five-wide gap in the "bar" sequence starting at 17, for example with a query such as this:
UPDATE foo SET bar = bar + 5 WHERE bar > 17;
SQLite refuses to execute this update, saying "Error: UNIQUE constraint failed: foo.bar". All right, sure: if the query is executed one row at a time and starts at the first row that meets the WHERE clause, the UNIQUE constraint will indeed be violated: two rows will have a bar column with a value of 23 (the row where bar was 18, and the original row where bar is 23). But if I could somehow force SQLite to run the update bottom-up (start at the highest value of bar and work backward), the UNIQUE constraint would not be violated.
SQLite has an optional ORDER BY / LIMIT clause for UPDATE, but that doesn't affect the order in which the UPDATEs occur; as stated at the bottom of this page, "the order in which rows are modified is arbitrary."
Is there some simple way to suggest to SQLite to process row updates in a certain order? Or do I have to use a more convoluted route such as a subquery?
UPDATE: This does not work; the same error appears:
UPDATE foo SET bar = bar + 5 WHERE bar IN
(SELECT bar FROM foo WHERE bar > 17 ORDER BY bar DESC);

If moving the unique constraint out of the table definition into its own independent index is feasible, implementing Ben's idea becomes easy:
CREATE TABLE foo(id INTEGER PRIMARY KEY, bar INTEGER);
CREATE UNIQUE INDEX bar_idx ON foo(bar);
-- Do stuff
DROP INDEX bar_idx;
-- Update bar values
CREATE UNIQUE INDEX bar_idx ON foo(bar); -- Restore the unique index
If not, something like this works:
CREATE TEMP TABLE foo_copy AS SELECT * FROM foo;
-- Update foo_copy's bar values
DELETE FROM foo;
INSERT INTO foo SELECT * FROM foo_copy;
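With the concrete update from the question, the placeholder comments boil down to the same statement run against the relevant table:
UPDATE foo SET bar = bar + 5 WHERE bar > 17;      -- first variant: runs while bar_idx is dropped
UPDATE foo_copy SET bar = bar + 5 WHERE bar > 17; -- second variant: foo_copy carries no UNIQUE constraint
In the second variant, CREATE TEMP TABLE ... AS SELECT does not copy constraints in SQLite, which is why the intermediate update cannot collide; you may also want to DROP TABLE foo_copy at the end.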

An alternative that doesn't require changing the table is an intermediate update that moves the new values into a range the existing values cannot occupy (easy if no values can be negative), followed by a second update that moves them to what they should be.
e.g. the following demonstrates this using negative intermediate values:
-- Load the data
DROP TABLE IF EXISTS foo;
CREATE TABLE foo (id INTEGER PRIMARY KEY, bar INTEGER UNIQUE);
WITH RECURSIVE cte1(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM cte1 LIMIT 100)
INSERT INTO foo (bar) SELECT * FROM cte1;
-- Show the original data
SELECT * FROM foo;
UPDATE foo SET bar = 0 - (bar + 5) WHERE bar > 17;
UPDATE foo SET bar = 0 - bar WHERE bar < 0;
-- Show the end result
SELECT * FROM foo;
Result 1 (the original data) and Result 2 (the updated data) show the bar values above 17 shifted up by 5, leaving 18-22 free.

Related

How do I add an auto incrementing column to an existing Vertica table?

I have a table that currently has the following structure
id, row1
(null), 232
(null), 4455
(null), 16
I'd like for id to be an auto incrementing primary key, as follows:
id, row1
1, 232
2, 4455
3, 16
I've read the documentation and it looks like the function that I need is AUTO_INCREMENT and that I can edit the table using an ALTER TABLE statement. However, I can't seem to get the syntax quite right. How do I go about doing this? Is it even possible with a pre-existing table?
What you need to do is the following:
create a new sequence:
CREATE SEQUENCE sequence_auto_increment START 1;
create a new table:
create table tab2 as select * from tab1 limit 0;
insert the data:
insert /*+ direct */ into tab2
select NEXTVAL('sequence_auto_increment'),row1 from tab1;
As #Kermit mentioned, the best way to do it in Vertica is to recreate the table (once) rather than altering it multiple times; use the direct hint so you skip the WOS storage (much faster).
As for the column constraint that #Nazmul created, I wouldn't use it. Vertica doesn't care much about constraints; you need to force it to insert exactly what you want, and default constraints are not the way to do that.
You need to update your existing data with something like the below:
UPDATE t2
SET id = ranked.id
FROM (
    SELECT row1, RANK() OVER (ORDER BY row1) AS id
    FROM t2
) AS ranked
WHERE t2.row1 = ranked.row1;
Then you alter your table using the syntax below:
-- get the value to start sequence at
SELECT MAX(id) FROM t2;
-- create the sequence
CREATE SEQUENCE seq1 START 5;
-- syntax as of 6.1
-- modify the column to add next value for future rows
ALTER TABLE t2 ALTER COLUMN id SET DEFAULT NEXTVAL('seq1');
If you want to use the AUTO_INCREMENT feature:
1) Copy the data to a temp table
2) Recreate the base table with the column using auto increment
3) Copy the data back for the other columns
If you just want the numbers in, refer to the other answer by Nazmul.
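A rough sketch of those three steps, assuming the table is tab1 with a single data column row1, and using an IDENTITY column in place of AUTO_INCREMENT (both exist in Vertica; treat the exact syntax as an assumption to check against your version's documentation):
CREATE TABLE tab1_new (id IDENTITY(1,1), row1 INT); -- new table with an auto-incrementing column
INSERT /*+ direct */ INTO tab1_new (row1)
SELECT row1 FROM tab1;                              -- id is filled in automatically
DROP TABLE tab1;
ALTER TABLE tab1_new RENAME TO tab1;
Identity values are generated by Vertica on insert, so the id column is simply omitted from the INSERT column list.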

INSERT new row if value does not exist and get id either way

I would like to insert a record into a table and if the record is already present get its id, otherwise run the insert and get the new record's id.
I will be inserting millions of records and have no idea how to do this in an efficient manner. What I am doing now is to run a select to check if the record is already present, and if not, insert it and get the inserted record's id. As the table is growing I imagine that SELECT is going to kill me.
What I am doing now in python with psycopg2 looks like this:
select = ("SELECT id FROM ... WHERE ...", [...])
cur.execute(*select)
if not cur.rowcount:
insert = ("INSERT INTO ... VALUES ... RETURNING id", [...])
cur.execute(*insert)
rid = cur.fetchone()[0]
Is it maybe possible to do something in a stored procedure like this:
BEGIN
    EXECUTE sql_insert;
    RETURN id;
EXCEPTION WHEN unique_violation THEN
    -- return id of already existing record
    -- from the exception info ?
END;
Any ideas of how to optimize a case like this?
First off, this is obviously not an UPSERT as UPDATE was never mentioned. Similar concurrency issues apply, though.
There will always be a race condition for this kind of task, but you can minimize it to an extremely tiny time slot, while at the same time querying for the ID only once with a data-modifying CTE (introduced with PostgreSQL 9.1):
Given a table tbl:
CREATE TABLE tbl(tbl_id serial PRIMARY KEY, some_col text UNIQUE);
Use this query:
WITH x AS (SELECT 'baz'::text AS some_col)  -- enter value(s) once
, y AS (
    SELECT x.some_col
         , (SELECT t.tbl_id FROM tbl t WHERE t.some_col = x.some_col) AS tbl_id
    FROM   x
    )
, z AS (
    INSERT INTO tbl(some_col)
    SELECT y.some_col
    FROM   y
    WHERE  y.tbl_id IS NULL
    RETURNING tbl_id
    )
SELECT COALESCE(
          (SELECT tbl_id FROM z)
        , (SELECT tbl_id FROM y)
       );
CTE x is only for convenience: enter values once.
CTE y retrieves tbl_id - if it already exists.
CTE z inserts the new row - if it doesn't.
The final SELECT avoids running another query on the table with the COALESCE construct.
Now, this can still fail if a concurrent transaction commits a new row with some_col = 'baz' exactly between CTE y and z, but that's extremely unlikely. If it happens you get a duplicate key violation and have to retry. Nothing is lost. If you don't face concurrent writes, you can just forget about this.
You can put this into a plpgsql function and rerun the query on duplicate key error automatically.
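A minimal sketch of such a function, assuming the tbl from above; the function name and the plain select-then-insert loop are my own, but the CTE query above could be retried the same way:
CREATE OR REPLACE FUNCTION f_tbl_id(_some_col text)
  RETURNS int
  LANGUAGE plpgsql AS
$func$
DECLARE
   _tbl_id int;
BEGIN
   LOOP
      SELECT t.tbl_id INTO _tbl_id FROM tbl t WHERE t.some_col = _some_col;
      EXIT WHEN FOUND;                           -- row already there: done

      BEGIN
         INSERT INTO tbl(some_col) VALUES (_some_col)
         RETURNING tbl_id INTO _tbl_id;
         EXIT;                                   -- insert succeeded: done
      EXCEPTION WHEN unique_violation THEN
         -- a concurrent insert won the race; loop back and pick up its id
      END;
   END LOOP;

   RETURN _tbl_id;
END
$func$;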
It goes without saying that you need two indexes in this setup (as displayed in the CREATE TABLE statement above):
a UNIQUE or PRIMARY KEY constraint on tbl_id (which is of serial type!)
another UNIQUE or PRIMARY KEY constraint on some_col
Both implement an index automatically.

ordered SQL update

Given a table like this, where x and y have a unique constraint.
id,x,y
1, 1,1
2, 1,2
3, 2,3
4, 3,5
..
I want to increase the values of x and y in a set of rows by a fixed amount with the UPDATE statement. Suppose I'm increasing them both by 1: it seems UPDATE follows id order and gives me an error after updating the row with (1, 2), since it then collides with the next row, (2, 3), which hasn't been updated to (3, 4) yet.
Googling around, I can't find a way to force UPDATE to use an order. To do it in reverse would be enough for my application. I am also sure the set is consistent after the whole update.
Any solutions? Some way to force an order into the update, or any way to make it postpone the constraint check until it's finished?
This is meant for a Django application, and it's meant to be compatible with all supported databases. I know some databases have atomic transactions and this problem won't happen, some have features to avoid this problem, but I need a strictly standard SQL solution.
For PostgreSQL you can define the primary key constraint as "deferrable" and then it would be only evaluated at commit time.
In PostgreSQL this would look like this:
postgres=>create table foo (
postgres(> id integer not null,
postgres(> x integer,
postgres(> y integer
postgres(>);
CREATE TABLE
postgres=>alter table foo add constraint pk_foo primary key (id) deferrable initially deferred;
ALTER TABLE
postgres=> insert into foo (id, x,y) values (1,1,1), (2,1,1), (3,1,1);
INSERT 0 3
postgres=> commit;
COMMIT
postgres=> update foo set id = id + 1;
UPDATE 3
postgres=> commit;
COMMIT
postgres=> select * from foo;
id | x | y
----+---+---
2 | 1 | 1
3 | 1 | 1
4 | 1 | 1
(3 rows)
postgres=>
For Oracle this is not necessary as it will evaluate the UPDATE statement as a single atomic operation, so that works out of the box there.
For the sake of reference, MS SQL Server will not present a problem with this either. UPDATE is a single atomic operation.
ALTER TABLE [table_name]
ADD CONSTRAINT unique_constraint
UNIQUE(x);
ALTER TABLE [table_name]
ADD CONSTRAINT unique_constraint2
UNIQUE(y);
UPDATE [table_name]
SET x = x + 1,
    y = y + 1;
Should pose no problem at all.

Incrementing with one query a set of values in a field with UNIQUE constraint, Postgres

I have a table in which I have a numeric field A, which is set to be UNIQUE. This field is used to indicate an order in which some action has to be performed. I want to make an UPDATE of all the values that are greater, for example, than 3. For example,
I have
A
1
2
3
4
5
Now, I want to add 1 to all values of A greater than 3. So, the result would be
A
1
2
3
5
6
The question is, whether it is possible to be done using only one query? Remember that I have a UNIQUE constraint on the column A.
Obviously, I tried
UPDATE my_table SET A = A + 1 WHERE A > 3;
but it did not work as I have the constraint on this field.
PostgreSQL 9.0 and later
PostgreSQL 9.0 added deferrable unique constraints, which is exactly the feature you seem to need. This way, uniqueness is checked at commit-time rather than update-time.
Create the UNIQUE constraint with the DEFERRABLE keyword:
ALTER TABLE foo ADD CONSTRAINT foo_uniq UNIQUE (foo_id) DEFERRABLE;
Later, before running the UPDATE statement, you run in the same transaction:
SET CONSTRAINTS foo_uniq DEFERRED;
Alternatively you can create the constraint with the INITIALLY DEFERRED keyword on the unique constraint itself -- so you don't have to run SET CONSTRAINTS -- but this might affect the performance of your other queries which don't need to defer the constraint.
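That variant might look like this (a sketch, reusing the foo/foo_id names from above):
ALTER TABLE foo ADD CONSTRAINT foo_uniq UNIQUE (foo_id) DEFERRABLE INITIALLY DEFERRED;
BEGIN;
UPDATE foo SET foo_id = foo_id + 1 WHERE foo_id > 3;
COMMIT; -- uniqueness is only checked here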
PostgreSQL 8.4 and older
If you only want to use the unique constraint for guaranteeing uniqueness -- not as a target for a foreign key -- then this workaround might help:
First, add a boolean column such as is_temporary to the table that temporarily distinguishes updated and non-updated rows:
CREATE TABLE foo (value int not null, is_temporary bool not null default false);
Next create a partial unique index that only affects rows where is_temporary=false:
CREATE UNIQUE INDEX ON foo (value) WHERE is_temporary=false;
Now, every time you make the updates you described, you run them in two steps:
UPDATE foo SET is_temporary=true, value=value+1 WHERE value>3;
UPDATE foo SET is_temporary=false WHERE is_temporary=true;
As long as these statements occur in a single transaction, this will be totally safe -- other sessions will never see the temporary rows. The downside is that you'll be writing the rows twice.
Do note that this is merely a unique index, not a constraint, but in practice it shouldn't matter.
You can do it in 2 queries with a simple trick:
First, update your column with +1, but also multiply by a factor of -1:
UPDATE my_table SET A = (A + 1) * -1 WHERE A > 3;
You will switch from 4, 5, 6 to -5, -6, -7.
Second, flip the sign back to restore positive values:
UPDATE my_table SET A = A * -1 WHERE A < 0;
You will have: 5, 6, 7
You can do this with a loop. I don't like this solution, but it works:
CREATE TABLE update_unique (id INT NOT NULL);
CREATE UNIQUE INDEX ux_id ON update_unique (id);
INSERT INTO update_unique(id) SELECT a FROM generate_series(1,100) AS foo(a);
DO $$
DECLARE
   v INT;
BEGIN
   FOR v IN SELECT id FROM update_unique WHERE id > 3 ORDER BY id DESC
   LOOP
      UPDATE update_unique SET id = id + 1 WHERE id = v;
   END LOOP;
END;
$$ LANGUAGE plpgsql;
In PostgreSQL 8.4, you might have to create a function to do this since you can't arbitrarily run PL/PGSQL from a prompt using DO (at least not to the best of my recollection).
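A sketch of such a wrapper (the function name is arbitrary; DO blocks only arrived in 9.0, and on 8.4 you may also need CREATE LANGUAGE plpgsql first):
CREATE OR REPLACE FUNCTION shift_ids() RETURNS void LANGUAGE plpgsql AS $$
DECLARE
   v INT;
BEGIN
   FOR v IN SELECT id FROM update_unique WHERE id > 3 ORDER BY id DESC
   LOOP
      UPDATE update_unique SET id = id + 1 WHERE id = v;
   END LOOP;
END;
$$;
SELECT shift_ids();
DROP FUNCTION shift_ids();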

Swap unique indexed column values in database

I have a database table and one of the fields (not the primary key) has a unique index on it. Now I want to swap values under this column for two rows. How could this be done? Two hacks I know are:
1. Delete both rows and re-insert them.
2. Update the rows with some other value, swap them, and then update to the actual values.
But I don't want to go for these as they do not seem to be the appropriate solution to the problem.
Could anyone help me out?
The magic word is DEFERRABLE here:
DROP TABLE ztable CASCADE;
CREATE TABLE ztable
( id integer NOT NULL PRIMARY KEY
, payload varchar
);
INSERT INTO ztable(id,payload) VALUES (1,'one' ), (2,'two' ), (3,'three' );
SELECT * FROM ztable;
-- This works, because there is no constraint
UPDATE ztable t1
SET payload=t2.payload
FROM ztable t2
WHERE t1.id IN (2,3)
AND t2.id IN (2,3)
AND t1.id <> t2.id
;
SELECT * FROM ztable;
ALTER TABLE ztable ADD CONSTRAINT OMG_WTF UNIQUE (payload)
DEFERRABLE INITIALLY DEFERRED
;
-- This should also work, because the constraint
-- is deferred until "commit time"
UPDATE ztable t1
SET payload=t2.payload
FROM ztable t2
WHERE t1.id IN (2,3)
AND t2.id IN (2,3)
AND t1.id <> t2.id
;
SELECT * FROM ztable;
RESULT:
DROP TABLE
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "ztable_pkey" for table "ztable"
CREATE TABLE
INSERT 0 3
id | payload
----+---------
1 | one
2 | two
3 | three
(3 rows)
UPDATE 2
id | payload
----+---------
1 | one
2 | three
3 | two
(3 rows)
NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index "omg_wtf" for table "ztable"
ALTER TABLE
UPDATE 2
id | payload
----+---------
1 | one
2 | two
3 | three
(3 rows)
I think you should go for solution 2. There is no 'swap' function in any SQL variant I know of.
If you need to do this regularly, I suggest solution 1, depending on how other parts of the software are using this data. You can have locking issues if you're not careful.
But in short: there is no other solution than the ones you provided.
Further to Andy Irving's answer, this worked for me (on SQL Server 2005) in a similar situation where I have a composite key and I need to swap a field which is part of the unique constraint.
key: pID, LNUM
rec1: 10, 0
rec2: 10, 1
rec3: 10, 2
and I need to swap LNUM so that the result is
key: pID, LNUM
rec1: 10, 1
rec2: 10, 2
rec3: 10, 0
the SQL needed:
UPDATE DOCDATA
SET LNUM = CASE LNUM
WHEN 0 THEN 1
WHEN 1 THEN 2
WHEN 2 THEN 0
END
WHERE (pID = 10)
AND (LNUM IN (0, 1, 2))
There is another approach that works with SQL Server: use a temp table and join to it in your UPDATE statement.
The problem is caused by having two rows with the same value at the same time, but if you update both rows at once (to their new, unique values), there is no constraint violation.
Pseudo-code:
-- setup initial data values:
insert into data_table(id, name) values(1, 'A')
insert into data_table(id, name) values(2, 'B')
-- create temp table that matches live table
select top 0 * into #tmp_data_table from data_table
-- insert records to be swapped
insert into #tmp_data_table(id, name) values(1, 'B')
insert into #tmp_data_table(id, name) values(2, 'A')
-- update both rows at once! No index violations!
update data_table set name = #tmp_data_table.name
from data_table join #tmp_data_table on (data_table.id = #tmp_data_table.id)
Thanks to Rich H for this technique.
- Mark
Assuming you know the PK of the two rows you want to update... This works in SQL Server, can't speak for other products. SQL is (supposed to be) atomic at the statement level:
CREATE TABLE testing
(
cola int NOT NULL,
colb CHAR(1) NOT NULL
);
CREATE UNIQUE INDEX UIX_testing_a ON testing(colb);
INSERT INTO testing VALUES (1, 'b');
INSERT INTO testing VALUES (2, 'a');
SELECT * FROM testing;
UPDATE testing
SET colb = CASE cola WHEN 1 THEN 'a'
WHEN 2 THEN 'b'
END
WHERE cola IN (1,2);
SELECT * FROM testing;
so you will go from:
cola colb
------------
1 b
2 a
to:
cola colb
------------
1 a
2 b
I also think that #2 is the best bet, though I would be sure to wrap it in a transaction in case something goes wrong mid-update.
An alternative (since you asked) to updating the unique index values with different values would be to update all of the other values in each row to those of the other row. Doing this means that you can leave the unique index values alone and still end up with the data that you want. Be careful, though, in case some other table references this table in a foreign key relationship, so that all of the relationships in the DB remain intact.
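Using the testing table from the CASE answer above as an illustration, that alternative might look like this (a sketch; cola is the only other column there, and it has no unique index of its own):
UPDATE testing
SET cola = CASE colb WHEN 'b' THEN 2
                     WHEN 'a' THEN 1
           END
WHERE colb IN ('a', 'b'); -- colb, the unique column, is never touched
The rows end up holding the same (cola, colb) pairs as a direct swap would have produced.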
I have the same problem. Here's my proposed approach in PostgreSQL. In my case, my unique index is a sequence value, defining an explicit user-order on my rows. The user will shuffle rows around in a web-app, then submit the changes.
I'm planning to add a "before" trigger. In that trigger, whenever my unique index value is updated, I will look to see if any other row already holds my new value. If so, I will give them my old value, and effectively steal the value off them.
I'm hoping that PostgreSQL will allow me to do this shuffle in the before trigger.
I'll post back and let you know my mileage.
In SQL Server, the MERGE statement can update rows that would normally break a UNIQUE KEY/INDEX. (Just tested this because I was curious.)
However, you'd have to use a temp table/variable to supply MERGE w/ the necessary rows.
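A sketch of that approach against the data_table(id, name) example above, using a table variable as the hypothetical supplier of the swapped rows:
DECLARE @swap TABLE (id int PRIMARY KEY, name varchar(10));
INSERT INTO @swap (id, name) VALUES (1, 'B'), (2, 'A');
MERGE data_table AS t
USING @swap AS s
   ON t.id = s.id
WHEN MATCHED THEN
   UPDATE SET name = s.name; -- both rows change in one atomic statement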
For Oracle there is an option, DEFERRED, but you have to add it to your constraint.
SET CONSTRAINT emp_no_fk_par DEFERRED;
To defer ALL constraints that are deferrable during the entire session, you can use the ALTER SESSION SET constraints=DEFERRED statement.
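For this to work, the constraint has to have been created as deferrable in the first place; a sketch with hypothetical names:
ALTER TABLE emp ADD CONSTRAINT emp_no_uq UNIQUE (emp_no) DEFERRABLE INITIALLY IMMEDIATE;
SET CONSTRAINT emp_no_uq DEFERRED;
UPDATE emp SET emp_no = emp_no + 1;
COMMIT; -- the unique check runs here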
I usually think of a value that absolutely no row in my table could have in that column. Usually - for unique column values - it's really easy. For example, for the values of a column 'position' (information about the order of several elements) it's 0.
Then you can copy value A to a variable, update it with value B, and then set value B from your variable. Two queries; I know of no better solution, though.
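One way to read that suggestion, as a sketch against a hypothetical elements table where position is UNIQUE and never 0, swapping the rows currently holding positions 5 and 7:
UPDATE elements SET position = 0 WHERE position = 5; -- park the first row on the impossible value
UPDATE elements SET position = 5 WHERE position = 7; -- the second row takes the freed position
UPDATE elements SET position = 7 WHERE position = 0; -- the parked row takes the other position
Done purely in SQL, without a client-side variable, it takes three single-row updates rather than two.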
Oracle has deferred integrity checking which solves exactly this, but it is not available in either SQL Server or MySQL.
1) Switch the ids of adjacent rows (students keyed by id):
id student
1 Abbot
2 Doris
3 Emerson
4 Green
5 Jeames
For the sample input, the output is:
id student
1 Doris
2 Abbot
3 Green
4 Emerson
5 Jeames
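The answer stops at the sample input and output; a sketch of one way to produce that result (SQL Server style, assuming a hypothetical table seat(id int primary key, student varchar(50))). The subquery for the maximum id keeps an odd trailing row, such as id 5 above, in place, so it works for any number of rows:
UPDATE seat
SET id = CASE
            WHEN id % 2 = 1 AND id = (SELECT MAX(id) FROM seat) THEN id -- last odd id stays put
            WHEN id % 2 = 1 THEN id + 1                                 -- odd ids move down one
            ELSE id - 1                                                 -- even ids move up one
         END;
Because it is a single statement, the primary key is only checked once all rows have their new ids.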
"in case n number of rows how will manage......"