PostgreSQL nested INSERTs / WITHs for foreign key insertions - sql

I'm using PostgreSQL 9.3, and I'm trying to write a SQL script to insert some data for unit tests, and I've run into a bit of a problem.
Let's say we have three tables, structured like this:
------- Table A ------- -------- Table B -------- -------- Table C --------
id | serial NOT NULL id | serial NOT NULL id | serial NOT NULL
foo | character varying a_id | integer NOT NULL b_id | integer NOT NULL
bar | character varying baz | character varying
The columns B.a_id and C.b_id are foreign keys to the id column of tables A and B, respectively.
What I'm trying to do is to insert a row into each of these three tables with pure SQL, without having the ID's hard-coded into the SQL (making assumptions about the database before this script is run seems undesirable, since if those assumptions change I'll have to go back and re-compute the proper ID's for all of the test data).
Note that I do realize I could do this programatically, but in general writing pure SQL is way less verbose than writing program code to execute SQL, so it makes more sense for test suite data.
Anyway, here's the query I wrote which I figured would work:
WITH X AS (
WITH Y AS (
INSERT INTO A (foo)
VALUES ('abc')
RETURNING id
)
INSERT INTO B (a_id, bar)
SELECT id, 'def'
FROM Y
RETURNING id
)
INSERT INTO C (b_id, baz)
SELECT id, 'ghi'
FROM X;
However, this doesn't work, and results in PostgreSQL telling me:
ERROR: WITH clause containing a data-modifying statement must be at the top level
Is there a correct way to write this type of query in general, without hard-coding the ID values?
(You can find a fiddle here which contains this example.)

Don't nest the common table expressions, just write one after the other:
WITH Y AS (
INSERT INTO A (foo)
VALUES ('abc')
RETURNING id
), x as (
INSERT INTO B (a_id, bar)
SELECT id, 'def'
FROM Y
RETURNING id
)
INSERT INTO C (b_id, baz)
SELECT id, 'ghi'
FROM X;

Related

Will order by preserve?

create table source_table (id number);
insert into source_table values(3);
insert into source_table values(1);
insert into source_table values(2);
create table target_table (id number, seq_val number);
create sequence example_sequence;
insert into target_table
select id, example_sequence.nextval
from
> (select id from source_table ***order by id***);
Is it officially assured that for the id's with the lower values in source_table corresponding sequence's value will also be lower when inserting into the source_table? In other words, is it guaranteed that the sorting provided by order by clause will be preserved when inserting?
EDIT
The question is not: 'Are rows ordered in a table as such?' but rather 'Can we rely on the order by clause used in the subquery when inserting?'.
To even more closely illustrate this, the contents of the target table in the above example, after running the query like select * from target_table order by id would be:
ID | SEQ_VAL
1 1
2 2
3 3
Moreover, if i specified descending ordering when inserting like this:
insert into target_table
select id, example_sequence.nextval
from
> (select id from source_table ***order by id DESC***);
The output of the same query from above would be:
ID | SEQ_VAL
1 3
2 2
3 1
Of that I'm sure, I have tested it multiple times. My question is 'Can I always rely on this ordering?'
Tables in a relational database are not ordered, and any apparent ordering in the result set of a cursor which lacks an ORDER BY is an artifact of data storage, is not guaranteed, and later actions on the table may cause this apparent ordering to change. If you want the results of a cursor to be ordered in a particular manner you MUST use an ORDER BY.

Returning row if value exists, inserting otherwises

I have a table that is created like so:
CREATE TABLE IF NOT EXISTS my_table
p_id uuid DEFAULT uuid_generate_v4() NOT NULL,
my_column varchar(250) NOT NULL,
PRIMARY KEY (p_id),
CONSTRAINT unique_mycolumn UNIQUE (mycolumn);
What I would like to do is return p_id if a particular value matches whatever is in my_column, but if not, insert that particular value and return the newly generated value. I can "sort" of get around this by trying to from within my code calling the respective queries to query the value, and if that returns null, insert it, but that's a fairly big race condition and I'd like to do it within sql. I've seen a lot of similiar questions, but this is different given the unique constraint on that column.
EDIT: This is what I'm looking to accomplish: Say I have the following data:
1 | foo
2 | bar
If I query for "bar", I'd like to get 2 back. However, If I query for "baz", I'd like to have the table look like the following:
1 | foo
2 | bar
3 | baz
So I guess it would be better described as "return ID if the value exists, and insert it the value otherwise to create a new row".
You could try something like this:
with ins as (insert into my_table (my_column)
select 'baz' where not exists
(select 1 from my_table where my_column = 'baz')
returning p_id)
select p_id from ins
union all
select p_id from my_table where my_column = 'baz';
Note however that this can fail if two transactions try to insert the value 'baz' at the same time.

SQL Multiple Row Insert w/ multiple selects from different tables

I am trying to do a multiple insert based on values that I am pulling from a another table. Basically I need to give all existing users access to a service that previously had access to a different one. Table1 will take the data and run a job to do this.
INSERT INTO Table1 (id, serv_id, clnt_alias_id, serv_cat_rqst_stat)
SELECT
(SELECT Max(id) + 1
FROM Table1 ),
'33', --The new service id
clnt_alias_id,
'PI' --The code to let the job know to grant access
FROM TABLE2,
WHERE serv_id = '11' --The old service id
I am getting a Primary key constraint error on id.
Please help.
Thanks,
Colin
This query is impossible. The max(id) sub-select will evaluate only ONCE and return the same value for all rows in the parent query:
MariaDB [test]> create table foo (x int);
MariaDB [test]> insert into foo values (1), (2), (3);
MariaDB [test]> select *, (select max(x)+1 from foo) from foo;
+------+----------------------------+
| x | (select max(x)+1 from foo) |
+------+----------------------------+
| 1 | 4 |
| 2 | 4 |
| 3 | 4 |
+------+----------------------------+
3 rows in set (0.04 sec)
You will have to run your query multiple times, once for each record you're trying to copy. That way the max(id) will get the ID from the previous query.
Is there a requirement that Table1.id be incremental ints? If not, just add the clnt_alias_id to Max(id). This is a nasty workaround though, and you should really try to get that column's type changed to auto_increment, like Marc B suggested.

Factor (string) to Numeric in PostgreSQL

Similar to this, is it possible to convert a String field to Numeric in PostgreSQL. For instance,
create table test (name text);
insert into test (name) values ('amy');
insert into test (name) values ('bob');
insert into test (name) values ('bob');
insert into test (name) values ('celia');
and add a field that is
name | num
-------+-----
amy | 1
bob | 2
bob | 2
celia | 3
The most effective "hash"-function of all is a serial primary key - giving you a unique number like you wished for in the question.
I also deal with duplicates in this demo:
CREATE TEMP TABLE string (
string_id serial PRIMARY KEY
,string text NOT NULL UNIQUE -- no dupes
,ct int NOT NULL DEFAULT 1 -- count instead of dupe rows
);
Then you would enter new strings like this:
(Data-modifying CTE requires PostgreSQL 9.1 or later.)
WITH x AS (SELECT 'abc'::text AS nu)
, y AS (
UPDATE string s
SET ct = ct + 1
FROM x
WHERE s.string = x.nu
RETURNING TRUE
)
INSERT INTO string (string)
SELECT nu
FROM x
WHERE NOT EXISTS (SELECT 1 FROM y);
If the string nu already exists, the count (ct) is increased by 1. If not, a new row is inserted, starting with a count of 1.
The UNIQUE also adds an index on the column string.string automatically, which leads to optimal performance for this query.
Add additional logic (triggers ?) for UPDATE / DELETE to make this bullet-proof - if needed.
Note, there is a super-tiny race condition here, if two concurrent transactions try to add the same string at the same moment in time. To be absolutely sure, you could use SERIALIZABLE transactions. More info and links under this this related question.
Live demo at sqlfiddle.
How 'bout a hash, such as md5, of name?
create table test (name text, hash text);
-- later
update test set hash = md5(name);
If you need to convert that md5 text to a number: Hashing a String to a Numeric Value in PostgresSQL
If they are all single characters, you could do this:
ALTER TABLE test ADD COLUMN num int;
UPDATE test SET num = ascii(name);
Though that would only return the character for the first letter if the string was more than a single character.
The exact case shown in your request can be produced with the dense_rank window function:
regress=# SELECT name, dense_rank() OVER (ORDER BY name) FROM test;
name | dense_rank
-------+------------
amy | 1
bob | 2
bob | 2
celia | 3
(4 rows)
so if you were adding a number for each row, you'd be able to do something like:
ALTER TABLE test ADD COLUMN some_num integer;
WITH gen(gen_name, gen_num) AS
(SELECT name, dense_rank() OVER (ORDER BY name) FROM test GROUP BY name)
UPDATE test SET some_num = gen_num FROM gen WHERE name = gen_name;
ALTER TABLE test ALTER COLUMN some_num SET NOT NULL;
however I think it's much more sensible to use a hash or to assign generated keys. I'm just showing that your example can be achieved.
The biggest problem with this approach is that inserting new data is a pain. It's a ranking (like your example shows) so if you INSERT INTO test (name) VALUES ('billy'); then the ranking changes.

Search for element in array of composite types

Using PostgreSQL 9.0
I have the following table setup
CREATE TABLE person (age integer, last_name text, first_name text, address text);
CREATE TABLE my_people (mperson person[]);
INSERT INTO my_people VALUES(array[ROW(44, 'John', 'Smith', '1234 Test Blvd.')::person]);
Now, i want to be able to write a select statement that can search and compare values of my composite types inside my mperson array column.
Example:
SELECT * FROM my_people WHERE 20 > ANY( (mperson) .age);
However when trying to execute this query i get the following error:
ERROR: column notation .age applied to type person[], which is not a composite type
LINE 1: SELECT mperson FROM my_people WHERE 20 > ANY((mperson).age);
So, you can see i'm trying to test the values of the composite type inside my array.
I know, i'm not supposed to use arrays and composites in my tables, but this best suites our applications requirements.
Also, we have several nested composite arrays, so a generic solution that would allow me to search many levels would be appreciated.
The construction ANY in your case looks redundant. You can write the query that way:
SELECT * FROM my_people WHERE (mperson[1]).age < 20;
Of course, if you have multiple values in this array, that won't work, but you can't get the exact array element the other way neither.
Why do you need arrays at all? You can just write one element of type person per row.
Check also the excellent HStore module, which might better suit your generic needs.
Temporary test setup:
CREATE TEMP TABLE person (age integer, last_name text, first_name text
, address text);
CREATE TEMP TABLE my_people (mperson person[]);
-- test-data, demonstrating 3 different syntax styles:
INSERT INTO my_better_people (mperson)
VALUES
(array[(43, 'Stack', 'Over', '1234 Test Blvd.')::person])
,(array['(44,John,Smith,1234 Test Blvd.)'::person,
'(21,Maria,Smith,1234 Test Blvd.)'::person])
,('{"(33,John,Miller,12 Test Blvd.)",
"(22,Frank,Miller,12 Test Blvd.)",
"(11,Bodi,Miller,12 Test Blvd.)"}');
Call (almost the solution):
SELECT (p).*
FROM (
SELECT unnest(mperson) AS p
FROM my_people) x
WHERE (p).age > 33;
Returns:
age | last_name | first_name | address
-----+-----------+------------+-----------------
43 | Stack | Over | 1234 Test Blvd.
44 | John | Smith | 1234 Test Blvd.
key is the unnest() function, that's available in 9.0.
Your mistake in the example is that you forget about the ARRAY layer in between. unnest() returns one row per base element, then you can access the columns in the complex type as demonstrated.
Brave new world
IF you actually want a whole people instead of an individual that fits the criteria, I propose you add a primary key to the table and proceed as follows:
CREATE TEMP TABLE my_better_people (id serial, mperson person[]);
-- shortcut to populate the new world by emigration from the old world ;)
INSERT INTO my_better_people (mperson)
SELECT mperson FROM my_people;
Find individuals:
SELECT id, (p).*
FROM (
SELECT id, unnest(mperson) AS p
FROM my_better_people) x
WHERE (p).age > 20;
Find whole people (solution):
SELECT *
FROM my_better_people p
WHERE EXISTS (
SELECT 1
FROM (
SELECT id, unnest(mperson) AS p
FROM my_better_people
) x
WHERE (p).age > 20
AND x.id = p.id
);
You can do it without a primary key, but that would be foolish.