Modifying json_populate_record before insertion - sql

Lets say I have som data that looks like
create table test.from
(
id integer primary key,
data json
);
insert into test.from
values
(4, '{ "some_field": 11 }'),
(9, '{ "some_field": 22 }');
create table test.to
(
id integer primary key,
some_field int
);
I would like the rows in the "to" table to have the same id key as the "from" row, and expand the json into separate columns. But using json_populate_record like below, will unsurprisingly give me null as key.
Method 1:
insert into test.to
select l.*
from test.from fr
cross join lateral json_populate_record(null::test.to, fr.data) l;
I can achieve what I'm looking for by naming columns like below
Method 2:
insert into test.to (id, some_field)
select
fr.id as id,
l.some_field
from test.from fr
cross join lateral json_populate_record(null::test.to, fr.data) l;
The challenge is that I want to avoid naming any columns other than the id column, both since it gets tedious, but also since I'd like to do this in a function where the column names are not known.
What modifications do I have to do to Method 1 to update the record with the correct id?

Just append the id key to your data like this:
insert into test.to
select l.*
from
test.from fr
cross join lateral jsonb_populate_record(
null::test.to,
fr.data::jsonb || jsonb_build_object('id', fr.id)) l;

Related

How can I stop my Postgres recusive CTE from indefinitely looping?

Background
I'm running Postgres 11 on CentOS 7.
I recently learned the basics of recursive CTEs in Postgres thanks to S-Man's answer to my recent question.
The problem
While working on a closely related issue (counting parts sold within bundles and assemblies) and using this recursive CTE, I ran into a problem where the query looped indefinitely and never completed.
I tracked this down to the presence of non-spurious 'self-referential' entries in the relator table, i.e. rows with the same value for parent_name and child_name.
I know that these are the source of the problem because when I recreated the situation with test tables and data, the undesired looping behavior occurred when these rows were present, and disappeared when these rows were absent or when UNION (which excludes duplicate returned rows) was used in the CTE rather than UNION ALL .
I think the data model itself probably needs adjusting so that these 'self-referential' rows aren't necessary, but for now, what I need to do is get this query to return the desired data on completion and stop looping.
How can I achieve this result? All guidance much appreciated!
Tables and test data
CREATE TABLE the_schema.names_categories (
id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
created_at TIMESTAMPTZ DEFAULT now(),
thing_name TEXT NOT NULL,
thing_category TEXT NOT NULL
);
CREATE TABLE the_schema.relator (
id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
created_at TIMESTAMPTZ DEFAULT now(),
parent_name TEXT NOT NULL,
child_name TEXT NOT NULL,
child_quantity INTEGER NOT NULL
);
/* NOTE: listing_name below is like an alias of a relator.parent_name as it appears in a catalog,
required to know because it is these listing_names that are reflected by sales.sold_name */
CREATE TABLE the_schema.catalog_listings (
id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
created_at TIMESTAMPTZ DEFAULT now(),
listing_name TEXT NOT NULL,
parent_name TEXT NOT NULL
);
CREATE TABLE the_schema.sales (
id INTEGER NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
created_at TIMESTAMPTZ DEFAULT now(),
sold_name TEXT NOT NULL,
sold_quantity INTEGER NOT NULL
);
CREATE VIEW the_schema.relationships_with_child_category AS (
SELECT
c.listing_name,
r.parent_name,
r.child_name,
r.child_quantity,
n.thing_category AS child_category
FROM
the_schema.catalog_listings c
INNER JOIN
the_schema.relator r
ON c.parent_name = r.parent_name
INNER JOIN
the_schema.names_categories n
ON r.child_name = n.thing_name
);
INSERT INTO the_schema.names_categories (thing_name, thing_category)
VALUES ('parent1', 'bundle'), ('child1', 'assembly'), ('child2', 'assembly'),('subChild1', 'component'),
('subChild2', 'component'), ('subChild3', 'component');
INSERT INTO the_schema.catalog_listings (listing_name, parent_name)
VALUES ('listing1', 'parent1'), ('parent1', 'child1'), ('parent1','child2'), ('child1', 'child1'), ('child2', 'child2');
INSERT INTO the_schema.catalog_listings (listing_name, parent_name)
VALUES ('parent1', 'child1'), ('parent1','child2');
/* note the two 'self-referential' entries */
INSERT INTO the_schema.relator (parent_name, child_name, child_quantity)
VALUES ('parent1', 'child1', 1),('child1', 'subChild1', 1), ('child1', 'subChild2', 1)
('parent1', 'child2', 1),('child2', 'subChild1', 1), ('child2', 'subChild3', 1), ('child1', 'child1', 1), ('child2', 'child2', 1);
INSERT INTO the_schema.sales (sold_name, sold_quantity)
VALUES ('parent1', 1), ('parent1', 2), ('listing1', 1);
The present query, loops indefinitely with the required UNION ALL
WITH RECURSIVE cte AS (
SELECT
s.sold_name,
s.sold_quantity,
r.child_name,
r.child_quantity,
r.child_category as category
FROM
the_schema.sales s
JOIN the_schema.relationships_with_child_category r
ON s.sold_name = r.listing_name
UNION ALL
SELECT
cte.sold_name,
cte.sold_quantity,
r.child_name,
r.child_quantity,
r.child_category
FROM cte
JOIN the_schema.relationships_with_child_category r
ON cte.child_name = r.parent_name
)
SELECT
child_name,
SUM(sold_quantity * child_quantity)
FROM cte
WHERE category = 'component'
GROUP BY child_name
;
In catalog_listings table listing_name and parent_name is same for child1 and child2
In relator table parent_name and child_name is also same for child1 and child2
These rows are creating cycling recursion.
Just remove those two rows from both the tables:
delete from catalog_listings where id in (4,5)
delete from relator where id in (7,8)
Then your desired output will be as below:
child_name
sum
subChild2
8
subChild3
8
subChild1
16
Is this the result you are looking for?
If you can't delete the rows you can use below add parent_name<>child_name condition to avoid those rows:
WITH RECURSIVE cte AS (
SELECT
s.sold_name,
s.sold_quantity,
r.child_name,
r.child_quantity,
r.child_category as category
FROM
the_schema.sales s
JOIN the_schema.relationships_with_child_category r
ON s.sold_name = r.listing_name and r.parent_name <>r.child_name
UNION ALL
SELECT
cte.sold_name,
cte.sold_quantity,
r.child_name,
r.child_quantity,
r.child_category
FROM cte
JOIN the_schema.relationships_with_child_category r
ON cte.child_name = r.parent_name and r.parent_name <>r.child_name
)
SELECT
child_name,
SUM(sold_quantity * child_quantity)
FROM cte
WHERE category = 'component'
GROUP BY child_name ;
You may be able to avoid infinite recursion simply by using UNION instead of UNION ALL.
The documentation describes the implementation:
Evaluate the non-recursive term. For UNION (but not UNION ALL), discard duplicate rows. Include all remaining rows in the result of the recursive query, and also place them in a temporary working table.
So long as the working table is not empty, repeat these steps:
Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. For UNION (but not UNION ALL), discard duplicate rows and rows that duplicate any previous result row. Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.
Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table.
"Getting rid of the duplicates" should cause the intermediate table to be empty at some point, which ends the iteration.

Creating a key value pair table

I have a table with 32 columns whose primary key is play_id and my goal is to create a key value pair table where the output would look like this:
play_id, field_name, field_value
play_id_1, game_date, ‘2020-07-23’
play_id_1, possession, ‘home’
' ', ' ', ' '
play_id_2, game_date, ‘2020-07-24’
play_id_2, possession, ‘away’
where all the entries in field_name are columns from the original table (besides play_id). So there would be 31 appearances of play_id_1 in this new table. I want the new primary key for this table to be (play_id, field_name). There's a long way of doing it where I can query for play_id and each column and union everything together, but I want to find a more elegant solution.
You can use a lateral join for that. But you will need to cast all values to text.
select t.play_id, x.*
from the_table t
cross join lateral (
values
('game_date', game_date::text),
('possession', possession::text),
('col_3', col_3::text),
...
) as x(field_name, field_value);
Another option is to convert the row to a JSON value and then unnest that json:
select t.play_id, x.*
from the_table t
cross join lateral jsonb_each(to_jsonb(t) - 'play_id') as x(field_name, field_value);

Select Records With Multiple Specific Ids

I'm trying to make a SQL query where I can find all ids of a member (member_ids) with all branches.
This all branches is an array consisting of all branch ids. So for example the values of all branches is [1,2,3,4,5,6,7,8,9,10].
I want to query similar to this one:
SELECT member_id FROM member_branch WHERE branch_id IN [1,2,3,4,5,6,7,8,9,10]
But to query all member_id having all branch_id. Because if a member_id has only [1,2,3], it will be included also in the query, and I don't want.
Try this,
SELECT member_id FROM member_branch where branch_id IN (1,2,3,4);
Will this not work.
declare #branchCount int =10
select member_id from (SELECT member_id, branch_id FROM member_branch WHERE branch_id IN ('1','2','3','4','5','6','7','8','9','10')) a group by a.member_id having count(a.member_id)>=#branchCount
If I understand the question correctly and based on the format of the data in the branch_id column ( which is a valid JSON array), you may try to use a JSON-based approach. When you parse a JSON array ([1,2,3,4,5,6,7,8,9,10] for example) using OPENJSON() and default schema, the result is a table with columns key, value and type. After that, you may use any set-base approach (EXCEPT and OUTER APPLY in the example).
Table:
CREATE TABLE member_branch (
member_id int,
branch_id nvarchar(100)
)
INSERT INTO member_branch
(member_id, branch_id)
VALUES
(1, '[1,2,3,4,5,6,7,8,9,10]'),
(2, '[1,2,3,4]'),
(3, '[10]'),
(4, '[1,2,3,4,5,6,7,8,9,10]')
DECLARE #branches nvarchar(100) = N'[1,2,3,4,5,6,7,8,9,10]'
Statement:
SELECT m.member_id
FROM member_branch m
OUTER APPLY (
SELECT [value] FROM OPENJSON(#branches)
EXCEPT
SELECT [value] FROM OPENJSON(m.branch_id)
) j
WHERE j.[value] IS NULL
Result:
---------
member_id
---------
1
4

Adding a LEFT JOIN on a INSERT INTO....RETURNING

My query Inserts a value and returns the new row inserted
INSERT INTO
event_comments(date_posted, e_id, created_by, parent_id, body, num_likes, thread_id)
VALUES(1575770277, 1, '9e028aaa-d265-4e27-9528-30858ed8c13d', 9, 'December 7th', 0, 'zRfs2I')
RETURNING comment_id, date_posted, e_id, created_by, parent_id, body, num_likes, thread_id
I want to join the created_by with the user_id from my user's table.
SELECT * from users WHERE user_id = created_by
Is it possible to join that new returning row with another table row?
Consider using a WITH structure to pass the data from the insert to a query that can then be joined.
Example:
-- Setup some initial tables
create table colors (
id SERIAL primary key,
color VARCHAR UNIQUE
);
create table animals (
id SERIAL primary key,
a_id INTEGER references colors(id),
animal VARCHAR UNIQUE
);
-- provide some initial data in colors
insert into colors (color) values ('red'), ('green'), ('blue');
-- Store returned data in inserted_animal for use in next query
with inserted_animal as (
-- Insert a new record into animals
insert into animals (a_id, animal) values (3, 'fish') returning *
) select * from inserted_animal
left join colors on inserted_animal.a_id = colors.id;
-- Output
-- id | a_id | animal | id | color
-- 1 | 3 | fish | 3 | blue
Explanation:
A WITH query allows a record returned from an initial query, including data returned from a RETURNING clause, which is stored in a temporary table that can be accessed in the expression that follows it to continue work on it, including using a JOIN expression.
You were right, I misunderstood
This should do it:
DECLARE mycreated_by event_comments.created_by%TYPE;
INSERT INTO
event_comments(date_posted, e_id, created_by, parent_id, body, num_likes, thread_id)
VALUES(1575770277, 1, '9e028aaa-d265-4e27-9528-30858ed8c13d', 9, 'December 7th', 0, 'zRfs2I')
RETURNING created_by into mycreated_by
SELECT * from users WHERE user_id = mycreated_by

How to insert columns with adding sequence column?

In an Oracle database (11gR2), I have a table my_table with columns (sequence, col1, col2, col3). I want to insert values into the table that are queried from other tables, i.e. insert into my_table select <query from other tables>. The problem is that the primary key is the four columns, hence I need to add a sequence starting from 0 up till the count of the rows to be inserted (order is not a problem).
I tried using a loop like this:
DECLARE
j NUMBER;
r_count number;
BEGIN
select count(1) into r_count from <my query to be inserted>;
FOR j IN 0 .. r_count
LOOP
INSERT INTO my_table
select <my query, incorporating r_count as sequence column> ;
END LOOP;
END;
But it didn't work, actually looped r_count times trying to insert the entire rows every time, as logically it shall do. How can I achieve the expected goal and insert rows with adding a sequence column?
Don't do this in a loop. Just use row_number():
INSERT INTO my_table(seq, . . .)
select row_number() over (order by NULL) - 1, . . .
from . . .;
Let's create table with sample data (to simulate your source of data)
-- This is your source query table (can be anything)
CREATE TABLE source_table
(
source_a VARCHAR(255),
source_b VARCHAR(255),
source_c VARCHAR(255)
);
insert into source_table (source_a, source_b, source_c) values ('A', 'B', 'C');
insert into source_table (source_a, source_b, source_c) values ('D', 'E', 'F');
insert into source_table (source_a, source_b, source_c) values ('G', 'H', 'I');
Then create target table, with id and 3 data columns.
-- This is your target_table
CREATE TABLE target_table
(
id NUMBER(9,0),
target_a VARCHAR2(255),
target_b VARCHAR2(255),
target_c VARCHAR2(255)
);
-- This is sequence used to ensure unique number in 1st column
CREATE sequence target_table_id_seq start with 0 minvalue 0 increment BY 1;
Finally, perform insert, loading id from sequence, rest of the data from source table.
INSERT INTO target_table
SELECT target_table_id_seq.nextval,
source_a,
source_b,
source_c
FROM source_table;
Results might look like
1 A B C
2 D E F
3 G H I
If you added some values later, they will continue with numbering 4,5,6 etc.. Or do you want to get order only inside the group ? Thus if you added 2 more rows JKL and MNO, target table would look like this
1 A B C
2 D E F
3 G H I
1 J K L
2 M N O
For that you need different solution (don't even need sequencer)
SELECT
RANK() OVER (ORDER BY source_a, source_b, source_c),
source_a,
source_b,
source_c
FROM source_table;
Technically you could use ROWNUM directly, BUT I opt for RANK() OVER analytical function due to consistent result. Please note, that this will breach your complex primary key if you try to insert the same rows twice (My first solution doesn't)
Clearly, you should use Oracle sequence.
First, create a sequence:
create sequence seq_my_table start with 0 minvalue 0 increment by 1;
Then use it:
INSERT INTO my_table (sequence, ...)
select seq_my_table.nextval, <the rest of my query>;
Sequence numbers will be inserted in succession.
So, you already have the table, it has the required number of rows, and now you want to add numbers from 0 to total number of rows minus one in the column named sequence? (perhaps not "sequence" but something less likely to clash with Oracle reserved words?)
Then this should work:
update my_table set seq = rownum - 1;