How to insert a row if not exists otherwise select and return its ID in both cases in MariaDB?

I have a table with ID primary key (autoincrement) and a unique column Name. Is there an efficient way in MariaDB to insert a row into this table if the same Name doesn't exist, otherwise select the existing row and, in both cases, return the ID of the row with this Name?
Here's a solution for Postgres. However, it seems MariaDB doesn't have the RETURNING id clause.
What I have tried so far is brute-force:
INSERT IGNORE INTO services (Name) VALUES ('JohnDoe');
SELECT ID FROM services WHERE Name='JohnDoe';
UPDATE: MariaDB 10.5 has a RETURNING clause; however, the queries I have tried so far throw a syntax error:
WITH i AS (INSERT IGNORE INTO services (`Name`) VALUES ('John') RETURNING ID)
SELECT ID FROM i
UNION
SELECT ID FROM services WHERE `Name`='John'

For a single row, assuming id is AUTO_INCREMENT.
INSERT INTO t (name)
VALUES ('JohnDoe')
ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id);
SELECT LAST_INSERT_ID();
That looks kludgy, but it is an example in the documentation.
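As a sketch of how this plays out on the asker's table, here is the same pattern run twice. The column types and the key name uq_services_name are assumptions; the question only says that ID is an auto-increment primary key and that Name is unique.

```sql
-- Assumed schema, reconstructed from the question's description.
CREATE TABLE services (
  ID   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  Name VARCHAR(100) NOT NULL,
  UNIQUE KEY uq_services_name (Name)
);

-- First call: the row is inserted, and LAST_INSERT_ID() is the new ID.
INSERT INTO services (Name) VALUES ('JohnDoe')
ON DUPLICATE KEY UPDATE ID = LAST_INSERT_ID(ID);
SELECT LAST_INSERT_ID();

-- Second call: the unique key on Name triggers the UPDATE branch.
-- Assigning ID to itself changes nothing, but LAST_INSERT_ID(expr)
-- stores the existing ID so the following SELECT can return it.
INSERT INTO services (Name) VALUES ('JohnDoe')
ON DUPLICATE KEY UPDATE ID = LAST_INSERT_ID(ID);
SELECT LAST_INSERT_ID();
```

Both SELECT statements should return the same value, because LAST_INSERT_ID() is tracked per connection and the second INSERT sets it explicitly.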
Caution: Most forms of INSERT will "burn" auto_inc ids. That is, they grab the next id(s) before realizing that the id won't be used. This could lead to overflowing the max auto_inc size.
It is also wise not to put the normalization inside the transaction that does the "meat" of the code. It ties up the table unnecessarily long and runs extra risk of burning ids in the case of rollback.
For batch updating of a 'normalization' table like that, see my notes here: http://mysql.rjweb.org/doc.php/staging_table#normalization (It avoids burning ids.)
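The id-burning caution above can be observed with a small experiment. This is only a sketch: the table burn_demo is hypothetical, InnoDB is assumed, and the exact size of any gap depends on the server's auto-increment settings.

```sql
-- Hypothetical demo table; assumes the InnoDB engine.
CREATE TABLE burn_demo (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL UNIQUE
) ENGINE=InnoDB;

INSERT INTO burn_demo (name) VALUES ('a');         -- inserted, gets an id
INSERT IGNORE INTO burn_demo (name) VALUES ('a');  -- duplicate: no row, but an id may be reserved
INSERT IGNORE INTO burn_demo (name) VALUES ('a');  -- same again
INSERT INTO burn_demo (name) VALUES ('b');         -- inserted, gets the next free id

-- If ids were burned, the two rows will not have consecutive ids.
SELECT id, name FROM burn_demo ORDER BY id;
```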

Related

Postgres: SELECT or INSERT in high concurrent write load DB

We have a DB for which we need a "selsert" (not upsert) function.
The function should take a text value and return the id column of the existing row (SELECT), or insert the value and return the id of the new row (INSERT).
Multiple processes will need to perform this operation (selsert).
I have been experimenting with pg_advisory_lock and ON CONFLICT clause for INSERT but am still not sure what approach would work best (even when looking at some of the other answers).
So far I have come up with the following:
WITH
selected AS (
SELECT id FROM test.body_parts WHERE (lower(trim(part))) = lower(trim('finger')) LIMIT 1
),
inserted AS (
INSERT INTO test.body_parts (part)
SELECT trim('finger')
WHERE NOT EXISTS ( SELECT * FROM selected )
-- ON CONFLICT (lower(trim(part))) DO NOTHING -- not sure if this is needed
RETURNING id
)
SELECT id, 'inserted' FROM inserted
UNION
SELECT id, 'selected' FROM selected
Will the above query (within a function) ensure consistency under highly concurrent write workloads?
Are there any other issues I must consider (locking, etc.)?
BTW, I can ensure that there are no duplicate values of (part) by creating a unique index. That is not an issue. What I am after is that SELECT returns the existing value if another process does the INSERT (I hope I am explaining this right).
The unique index would have the following definition:
CREATE UNIQUE INDEX body_parts_part_ux
ON test.body_parts
USING btree
(lower(trim(part)));

Is there an elegant way to deal with inserting rows with duplicate identifiers?

I'm trying to insert some rows into my table that have the same unique identifier, but all the other fields are different (the rows represent points on a map, and they just happen to have the same name). The final result I'd like to end up with is to somehow modify the offending rows to have unique identifiers (adding on some incrementing number to the identifier, like "name0", "name1", "name2", etc.) during the insertion command.
I'm aware of Postgres's recent addition of "ON CONFLICT" support, but it's not quite what I'm looking for.
According to the Postgres 9.6 Documentation:
The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error. For each individual row proposed for insertion, either the insertion proceeds, or ... the alternative conflict_action is taken. ...ON CONFLICT DO UPDATE updates the existing row that conflicts with the row proposed for insertion as its alternative action.
What I would like to do is 1) either modify the offending row or the insertion itself and 2) proceed with the insertion (instead of replacing it with an update, like the ON CONFLICT feature does). Is there an elegant way of accomplishing this? Or am I going to need to write something more complex?
You can do this:
create table my_table
(
name text primary key,
some_column varchar
);
create sequence my_table_seq;
The sequence is used to assign a unique suffix to the new row's PK column.
The "insert on conflict insert modified" behaviour can be done like this:
with data (name, some_column) as (
values ('foo', 'bar')
), inserted as (
insert into my_table
select *
from data
on conflict (name) do nothing
returning *
)
insert into my_table (name, some_column)
select concat(name, '_', nextval('my_table_seq')), some_column
from data
where not exists (select 1 from inserted);
The first time you insert a value into the PK column, the insert (in the CTE "inserted") just proceeds. The final insert won't insert anything because the where not exists () prevents that as the inserted returned one row.
The second time you run this, the first insert won't insert anything, and thus the second (final) insert will.
There is one drawback though: if something was inserted by the "inserted" CTE, the overall statement will report "0 rows affected" because the final insert is the one "driving" this information.

Return rows from INSERT with ON CONFLICT without needing to update

I have a situation where I very frequently need to get a row from a table with a unique constraint, and if none exists then create it and return.
For example my table might be:
CREATE TABLE names(
id SERIAL PRIMARY KEY,
name TEXT,
CONSTRAINT names_name_key UNIQUE (name)
);
And it contains:
id | name
1 | bob
2 | alice
Then I'd like to:
INSERT INTO names(name) VALUES ('bob')
ON CONFLICT DO NOTHING RETURNING id;
Or perhaps:
INSERT INTO names(name) VALUES ('bob')
ON CONFLICT (name) DO NOTHING RETURNING id
and have it return bob's id 1. However, RETURNING only returns either inserted or updated rows. So, in the above example, it wouldn't return anything. In order to have it function as desired I would actually need to:
INSERT INTO names(name) VALUES ('bob')
ON CONFLICT ON CONSTRAINT names_name_key DO UPDATE
SET name = 'bob'
RETURNING id;
which seems kind of cumbersome. I guess my questions are:
What is the reasoning for not allowing the (my) desired behaviour?
Is there a more elegant way to do this?
It's the recurring problem of SELECT or INSERT, related to (but different from) an UPSERT. The new UPSERT functionality in Postgres 9.5 is still instrumental.
WITH ins AS (
INSERT INTO names(name)
VALUES ('bob')
ON CONFLICT ON CONSTRAINT names_name_key DO UPDATE
SET name = NULL
WHERE FALSE -- never executed, but locks the row
RETURNING id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM names
WHERE name = 'bob' -- only executed if no INSERT
LIMIT 1;
This way you do not actually write a new row version without need.
I assume you are aware that in Postgres every UPDATE writes a new version of the row due to its MVCC model - even if name is set to the same value as before. This would make the operation more expensive, add to possible concurrency issues / lock contention in certain situations and bloat the table additionally.
However, there is still a tiny corner case for a race condition. Concurrent transactions may have added a conflicting row, which is not yet visible in the same statement. Then INSERT and SELECT come up empty.
Proper solution for single-row UPSERT:
Is SELECT or INSERT in a function prone to race conditions?
General solutions for bulk UPSERT:
How to use RETURNING with ON CONFLICT in PostgreSQL?
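For the corner case described above, one common shape of a fix is a small PL/pgSQL function that loops until either the SELECT or the INSERT produces a row. The following is a sketch, not necessarily what the linked answers contain: the function name get_or_create_name_id is hypothetical, and it assumes the names table from the question, Postgres 9.5 or later, and the default READ COMMITTED isolation level, where each statement inside the function sees rows committed in the meantime.

```sql
-- Hypothetical helper; assumes the names table defined in the question.
CREATE OR REPLACE FUNCTION get_or_create_name_id(_name text, OUT _id int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   LOOP
      -- Fast path: the row already exists and is visible.
      SELECT id INTO _id FROM names WHERE name = _name;
      EXIT WHEN FOUND;

      -- Try to insert; on a conflict nothing is inserted or returned.
      INSERT INTO names (name) VALUES (_name)
      ON CONFLICT (name) DO NOTHING
      RETURNING id INTO _id;
      EXIT WHEN FOUND;

      -- Neither statement produced a row: a concurrent transaction has
      -- added a conflicting row that was not visible yet. Try again.
   END LOOP;
END
$func$;

SELECT get_or_create_name_id('bob');
```

The SELECT comes first because, when most names already exist, it avoids attempting an INSERT at all.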
Without concurrent write load
If concurrent writes (from a different session) are not possible you don't need to lock the row and can simplify:
WITH ins AS (
INSERT INTO names(name)
VALUES ('bob')
ON CONFLICT ON CONSTRAINT names_name_key DO NOTHING -- no lock needed
RETURNING id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM names
WHERE name = 'bob' -- only executed if no INSERT
LIMIT 1;

Get Id from a conditional INSERT

For a table like this one:
CREATE TABLE Users(
id SERIAL PRIMARY KEY,
name TEXT UNIQUE
);
What would be the correct one-query insert for the following operation:
Given a user name, insert a new record and return the new id. But if the name already exists, just return the id.
I am aware of the new syntax within PostgreSQL 9.5 for ON CONFLICT(column) DO UPDATE/NOTHING, but I can't figure out how, if at all, it can help, given that I need the id to be returned.
It seems that RETURNING id and ON CONFLICT do not belong together.
The UPSERT implementation is hugely complex to be safe against concurrent write access. Take a look at this Postgres Wiki that served as log during initial development. The Postgres hackers decided not to include "excluded" rows in the RETURNING clause for the first release in Postgres 9.5. They might build something in for the next release.
This is the crucial statement in the manual to explain your situation:
The syntax of the RETURNING list is identical to that of the output
list of SELECT. Only rows that were successfully inserted or updated
will be returned. For example, if a row was locked but not updated
because an ON CONFLICT DO UPDATE ... WHERE clause condition was not
satisfied, the row will not be returned.
Bold emphasis mine.
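The quoted rule can be seen with a statement like the following. It is a sketch that assumes the users table from the question already holds a row named 'bob'.

```sql
-- Assumes users already contains a row with name = 'bob'.
INSERT INTO users (name)
VALUES ('bob')
ON CONFLICT (name) DO UPDATE
SET name = EXCLUDED.name
WHERE FALSE          -- condition never satisfied: the row is locked, not updated
RETURNING id;        -- so no row comes back for the existing 'bob'
```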
For a single row to insert:
Without concurrent write load on the same table
WITH ins AS (
INSERT INTO users(name)
VALUES ('new_usr_name') -- input value
ON CONFLICT(name) DO NOTHING
RETURNING users.id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM users -- 2nd SELECT never executed if INSERT successful
WHERE name = 'new_usr_name' -- input value a 2nd time
LIMIT 1;
With possible concurrent write load on the table
Consider this instead (for single row INSERT):
Is SELECT or INSERT in a function prone to race conditions?
To insert a set of rows:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How to include excluded rows in RETURNING from INSERT ... ON CONFLICT
All three with very detailed explanation.
For a single row insert and no update:
with i as (
insert into users (name)
select 'the name'
where not exists (
select 1
from users
where name = 'the name'
)
returning id
)
select id
from users
where name = 'the name'
union all
select id from i
The manual on the primary query and the WITH subqueries:
The primary query and the WITH queries are all (notionally) executed at the same time
Although that sounds like "same snapshot" to me, I'm not sure, since I don't know what "notionally" means in that context.
But there is also:
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot
If I understand correctly, that "same snapshot" bit prevents a race condition. But again, I'm not sure whether "all the statements" refers only to the statements in the WITH subqueries, excluding the main query. To avoid any doubt, move the select from the previous query into a WITH subquery:
with s as (
select id
from users
where name = 'the name'
), i as (
insert into users (name)
select 'the name'
where not exists (select 1 from s)
returning id
)
select id from s
union all
select id from i

Sqlite - SELECT or INSERT and SELECT in one statement

I'm trying to avoid writing separate SQL queries to achieve the following scenario:
I have a Table called Values:
Values:
id INT (PK)
data TEXT
I would like to check if certain data exists in the table, and if it does, return its id, otherwise insert it and return its id.
The (very) naive way would be:
select id from Values where data = "SOME_DATA";
if id is not null, take it.
if id is null then:
insert into Values(data) values("SOME_DATA");
and then select it again to see its id or use the returned id.
I am trying to make the above functionality in one line.
I think I'm getting close, but I couldn't make it yet:
So far I got this:
select id from Values where data=(COALESCE((select data from Values where data="SOME_DATA"), (insert into Values(data) values("SOME_DATA"));
I'm trying to take advantage of the fact that the second select will return null and then the second argument to COALESCE will be returned. No success so far. What am I missing?
Your command does not work because in SQL, INSERT does not return a value.
If you have a unique constraint/index on the data column, you can use that to prevent duplicates if you blindly insert the value; this uses SQLite's INSERT OR IGNORE extension:
INSERT OR IGNORE INTO "Values"(data) VALUES('SOME_DATA');
SELECT id FROM "Values" WHERE data = 'SOME_DATA';
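The table as described in the question has no unique constraint on data, and INSERT OR IGNORE only skips a row when some constraint would be violated. A minimal sketch of the full sequence, assuming the table already exists and using a hypothetical index name values_data_ux:

```sql
-- Hypothetical index name; enforces uniqueness of data so that
-- INSERT OR IGNORE has a conflict to ignore.
CREATE UNIQUE INDEX IF NOT EXISTS values_data_ux ON "Values"(data);

-- Inserts the value only if it is not present yet.
INSERT OR IGNORE INTO "Values"(data) VALUES ('SOME_DATA');

-- Returns the id whether the row was just inserted or already existed.
SELECT id FROM "Values" WHERE data = 'SOME_DATA';
```

Creating the index fails if the table already contains duplicate data values, so those would need to be cleaned up first.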