duplicating rows through multiple tables(new id/foreign key) with some column modifications - sql

The basic concept is to duplicate the rows in table1 whose id is in some range (for example 100 to 10000),
modify some of the column data, then insert them with a new id.
Table2 references table1.id with a foreign key, table3 references table2.id with a foreign key,
... and tableX references tableX-1.id with a foreign key.
I also have to modify some of the data in table2..tableX.
I started to think about writing nested loops; for the first 3 tables it looks like this (in PL/SQL), and it should probably work:
declare
    table1_row table1%rowtype;
    table2_row table2%rowtype;
    table3_row table3%rowtype;
begin
    for t1 in (select * from table1
               where id between 100 and 10000)
    loop
        table1_row := t1;
        table1_row.id := tableseq.nextval;
        table1_row.col1 := 'asdf';
        table1_row.col4 := 'xxx';
        insert into table1 values table1_row;
        for t2 in (select * from table2
                   where foreign_key_id = t1.id)
        loop
            table2_row := t2;
            table2_row.id := tableseq.nextval;
            table2_row.foreign_key_id := table1_row.id;
            table2_row.col3 := 'gfdgf';
            insert into table2 values table2_row;
            for t3 in (select * from table3
                       where foreign_key_id = t2.id)
            loop
                table3_row := t3;
                table3_row.id := tableseq.nextval;
                table3_row.foreign_key_id := table2_row.id;
                table3_row.col1 := 'gdfgdg';
                insert into table3 values table3_row;
            end loop;
        end loop;
    end loop;
end;
Any better solutions? With about 10-20 nested loops, it looks awful :(
Thanks in advance.

I believe you can use an insert statement with several subqueries to clean this up. Here is a simpler example but I believe you can extrapolate for your specific case:
insert into table1
    (col1, col2, col3, col4, col5)
select 'asdf',
    (select table2_data --whatever data from this table you want
     from table2
     where foreign_key_id = table1.id),
    (select table3_data --whatever data from this table you want
     from table3
     where foreign_key_id = table1.id),
    'xxx',
    table1.col5
from table1
where table1.id between 100 and 10000
Note that this assumes the id column is generated automatically (an auto-increment or identity column, or a trigger that reads a sequence), so it does not need to be part of the insert statement. I also added "table1.col5" as an example of how to carry data over unchanged from the existing row into the duplicated row (as I'm assuming you want some data to be duplicated as-is).
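As a rough, set-based alternative to the nested loops in the question, each level can be copied with one INSERT ... SELECT if the old-to-new id pairs are recorded first. The sketch below is only an illustration and makes several assumptions: it runs on SQLite through Python's sqlite3 module rather than on Oracle, the helper tables map1 and map2 are hypothetical, new ids are computed as old id + MAX(id) instead of being drawn from tableseq, and only the columns named in the question (plus col5 from the answer) are copied.

```python
import sqlite3

def duplicate_range(conn, lo, hi):
    """Copy table1 rows with id in [lo, hi] together with their table2 children."""
    with conn:  # commit on success, roll back on any error
        cur = conn.cursor()
        # Hypothetical helper tables: old id -> new id, one per level.
        cur.execute("CREATE TEMP TABLE IF NOT EXISTS map1 "
                    "(old_id INTEGER PRIMARY KEY, new_id INTEGER)")
        cur.execute("CREATE TEMP TABLE IF NOT EXISTS map2 "
                    "(old_id INTEGER PRIMARY KEY, new_id INTEGER)")
        cur.execute("DELETE FROM map1")
        cur.execute("DELETE FROM map2")

        # Level 1: new id = old id + current maximum id (stand-in for a sequence).
        cur.execute(
            "INSERT INTO map1 "
            "SELECT id, id + (SELECT MAX(id) FROM table1) "
            "FROM table1 WHERE id BETWEEN ? AND ?", (lo, hi))
        cur.execute(
            "INSERT INTO table1 (id, col1, col4, col5) "
            "SELECT m.new_id, 'asdf', 'xxx', t.col5 "
            "FROM table1 t JOIN map1 m ON m.old_id = t.id")

        # Level 2: copy only children of copied parents, re-pointing the foreign key.
        cur.execute(
            "INSERT INTO map2 "
            "SELECT t.id, t.id + (SELECT MAX(id) FROM table2) "
            "FROM table2 t JOIN map1 m ON m.old_id = t.foreign_key_id")
        cur.execute(
            "INSERT INTO table2 (id, foreign_key_id, col3) "
            "SELECT m2.new_id, m1.new_id, 'gfdgf' "
            "FROM table2 t "
            "JOIN map2 m2 ON m2.old_id = t.id "
            "JOIN map1 m1 ON m1.old_id = t.foreign_key_id")

def demo():
    """Build a tiny made-up data set, duplicate ids 100..10000, return the copies."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 "
                 "(id INTEGER PRIMARY KEY, col1 TEXT, col4 TEXT, col5 TEXT)")
    conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, "
                 "foreign_key_id INTEGER REFERENCES table1(id), col3 TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)",
                     [(5, "x", "y", "z"), (100, "a", "b", "keep"),
                      (101, "c", "d", "keep2")])
    conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                     [(1, 100, "p"), (2, 100, "q"), (3, 5, "r")])
    duplicate_range(conn, 100, 10000)
    parents = conn.execute("SELECT id, col1, col4, col5 FROM table1 "
                           "WHERE id > 101 ORDER BY id").fetchall()
    children = conn.execute("SELECT id, foreign_key_id, col3 FROM table2 "
                            "WHERE id > 3 ORDER BY id").fetchall()
    return parents, children
```

Each further level (table3 and so on) would repeat the level 2 pattern with its own map table, so the code grows linearly rather than nesting. The MAX(id) shift is only safe when nothing else inserts into the tables at the same time; with a real sequence the map would be filled from the sequence instead.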

Related

move all values from column in one table to new table and update third table with relation between those tables, PostgreSQL

I am using SQL again after a long time and I have the following situation:
I have existing table1 with columns id, name and a lot of other columns, it already contains rows.
I created empty table2 which only has columns id and name.
I created empty table3 which only has reference columns table1_id and table2_id.
Now I want to:
1) take all the values from column name in table1 (can be NULL, discard them in that case),
2) insert them as new rows into table2,
3) insert ids of the corresponding table1 and table2 rows into table3,
4) remove the column name from table1.
=> probably ALTER TABLE table1 DROP COLUMN name;, but I guess there may be a neater way to cut the result from step 1, transform it and paste as rows in step 2.
EDIT: I came up with something like (not tested yet):
DO $$
DECLARE
    result1 record;
BEGIN
    FOR result1 IN SELECT id, name FROM table1 LOOP
        WITH result2 AS (
            INSERT INTO table2 (name) VALUES (result1.name) RETURNING id
        )
        INSERT INTO table3 (table2_id, table1_id)
        SELECT result2.id, result1.id FROM result2;
    END LOOP;
END
$$;
ALTER TABLE table1 DROP COLUMN name;
EDIT:
I forgot to mention that if the name already exists in table2, I don't want to add it again (it should be unique in table2), but I still want to add the relation between the id from table1 and the inserted or existing id from table2 to table3.
EDIT:
I found we have source scripts for creating the database and I changed it there. Now I don't know how to get rid of this open question :(
For steps 1) & 2):
--Since you already have a table2
DROP TABLE table2;
--Create new table2 with data. Unless you are going to replace NULL with something
--discarding them would just end up with NULL again.
CREATE TABLE table2 AS SELECT id, name FROM table1;
Step 3). Not sure of the purpose of table3 as you would have matching id values between table1 and table2. In fact you could use that to set up a FOREIGN KEY relationship between them.
Step 4) Your solution: ALTER TABLE table1 DROP COLUMN name;
Not sure how you want to use it. If you want to run it as one-time transformation in one bulk, this could help (you can try the code on sqlfiddle):
CREATE TABLE table1 (
id int,
name varchar(9)
);
INSERT INTO table1 (
id,
name
)
VALUES
(1, 'A'),
(2, null),
(3, 'C'),
(4, null),
(5, 'E'),
(6, 'C')
;
CREATE TABLE table2 (
id SERIAL,
name varchar(9) UNIQUE
);
INSERT INTO table2 (name)
SELECT DISTINCT name
FROM table1
WHERE name IS NOT NULL
;
/*
-- This would be better option, but I was not able to test the merge/upsert function of PostgreSQL
INSERT INTO table2 (name)
SELECT name
FROM table1
WHERE name IS NOT NULL
ON CONFLICT ON CONSTRAINT table2_name_key DO NOTHING --merge/upsert, supported in PostgreSQL 9.5 and newer
;
*/
CREATE TABLE table3 (
id_table1,
id_table2
) AS
SELECT
t1.id id_table1,
t2.id id_table2
FROM table1 t1
INNER JOIN table2 t2
ON t1.name = t2.name
;
--ALTER TABLE table1 DROP COLUMN name;
This could also be useful:
stackoverflow_1
postgresqltutorial
stackoverflow_2
PostgreSQL documentation with PL/pgSQL code (the suggestion you wrote in the question goes much more in this direction)
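To make the steps concrete, including the later requirement that a name already present in table2 is reused rather than inserted again, here is a small sketch. It is an assumption-laden illustration, not the PostgreSQL solution itself: it runs on SQLite through Python's sqlite3 module, it uses the column names table1_id and table2_id from the question, and it uses NOT EXISTS in place of ON CONFLICT so that it does not depend on upsert support.

```python
import sqlite3

def split_names(conn):
    """Move distinct names into table2 and record the links in table3."""
    with conn:  # both statements succeed or fail together
        # Steps 1 and 2: one table2 row per distinct non-NULL name not already there.
        conn.execute(
            "INSERT INTO table2 (name) "
            "SELECT DISTINCT t1.name FROM table1 t1 "
            "WHERE t1.name IS NOT NULL "
            "AND NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.name = t1.name)")
        # Step 3: link every table1 row to the table2 row that holds its name.
        conn.execute(
            "INSERT INTO table3 (table1_id, table2_id) "
            "SELECT t1.id, t2.id FROM table1 t1 "
            "JOIN table2 t2 ON t2.name = t1.name")

def demo():
    """Use the sample rows from the answer, plus one pre-existing name in table2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute("CREATE TABLE table3 (table1_id INTEGER, table2_id INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                     [(1, "A"), (2, None), (3, "C"), (4, None), (5, "E"), (6, "C")])
    conn.execute("INSERT INTO table2 (name) VALUES ('C')")  # name that already exists
    split_names(conn)
    names = [row[0] for row in
             conn.execute("SELECT name FROM table2 ORDER BY name")]
    links = conn.execute(
        "SELECT t3.table1_id, t2.name FROM table3 t3 "
        "JOIN table2 t2 ON t2.id = t3.table2_id "
        "ORDER BY t3.table1_id").fetchall()
    return names, links
```

Rows with a NULL name drop out of both statements on their own: the first filters them explicitly, and the join in the second never matches NULL. Dropping the name column from table1 would follow as a separate ALTER TABLE once the links have been checked.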

Inserting new rows in table if not already existing

I have a table that can get bigger over time, and I want to insert some of its rows into another table, but I also want to make sure I am not duplicating the rows that I inserted before.
So here is the type of condition for my insert:
INSERT INTO SecondTable(Col1,Col2)
SELECT Col5,Col6
FROM
FirstTable ft
WHERE ft.RecType = 'ABC'
So if I keep running this it will keep inserting the same rows again and again. How can I tell it to insert a row only if it is not already there?
You can use not exists:
INSERT INTO SecondTable(Col1,Col2)
SELECT Col5,Col6
FROM FirstTable ft
WHERE ft.RecType = 'ABC' AND
NOT EXISTS (SELECT 1 FROM SecondTable t2 WHERE t2.col1 = ft.col5 AND t2.col2 = ft.col6);
Create a unique constraint on the table over the columns that identify a row as unique. This also helps preserve the integrity of the table: when you try to insert a duplicate record, the RDBMS will raise an error.
ALTER TABLE SecondTable
ADD UNIQUE (col1, col2, col3);
INSERT INTO SecondTable(Col1,Col2)
SELECT Col5,Col6
FROM FirstTable ft
LEFT JOIN SecondTable st ON st.Col1 = ft.Col5 AND st.Col2 = ft.Col6
WHERE st.Col1 IS NULL AND ft.RecType = 'ABC'
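A quick way to convince yourself that the NOT EXISTS form can be re-run safely is to execute it several times and watch the row count. The following is a minimal sketch on SQLite through Python's sqlite3 module; the sample rows are made up, and the table and column names are the ones from the question.

```python
import sqlite3

# The statement from the NOT EXISTS answer, comparing both copied columns.
INSERT_MISSING = """
INSERT INTO SecondTable (Col1, Col2)
SELECT ft.Col5, ft.Col6
FROM FirstTable ft
WHERE ft.RecType = 'ABC'
  AND NOT EXISTS (SELECT 1 FROM SecondTable st
                  WHERE st.Col1 = ft.Col5 AND st.Col2 = ft.Col6)
"""

def demo():
    """Run the insert twice, add one new source row, run it again."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE FirstTable (Col5 TEXT, Col6 TEXT, RecType TEXT)")
    conn.execute("CREATE TABLE SecondTable (Col1 TEXT, Col2 TEXT)")
    conn.executemany("INSERT INTO FirstTable VALUES (?, ?, ?)",
                     [("a", "1", "ABC"), ("b", "2", "ABC"), ("c", "3", "XYZ")])
    counts = []
    for _ in range(2):
        conn.execute(INSERT_MISSING)
        counts.append(
            conn.execute("SELECT COUNT(*) FROM SecondTable").fetchone()[0])
    # A row that appears later in the source is picked up by the next run.
    conn.execute("INSERT INTO FirstTable VALUES ('d', '4', 'ABC')")
    conn.execute(INSERT_MISSING)
    counts.append(
        conn.execute("SELECT COUNT(*) FROM SecondTable").fetchone()[0])
    return counts
```

One caveat: the equality test never matches NULL values, so source rows with NULL in Col5 or Col6 would be inserted again on every run unless they are handled separately.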

Copy data from table to table doing INSERT or UPDATE

I need to copy a lot of data from one table to another. If the data already exists, I need to update it, otherwise I need to insert it. The data to be copied is selected using a WHERE condition. The data has a primary key (a string of up to 12 characters).
If I was just inserting the data, I would do
INSERT INTO T2 SELECT COL1, COL2 FROM T1 WHERE T1.ID ='I'
but I cannot figure out how to do the INSERT / UPDATE. I keep seeing references to upserts and MERGE, but MERGE appears to have issues, and I cannot figure out how to do the upsert for multiple records.
What is the best solution for this?
If you want to avoid merges (though you should not be afraid of it) you can do something like
update t2
set col1 = t1.col1
,col2 = t1.col2
from t2
join t1
on t2.[joinkey] = t1.[joinkey]
where [where clause]
And afterwards, for the ones that do not exist yet:
insert into t2(col1,col2)
select col1,col2 from t1
where not exists (select * from t2 where t1.[joinkey] = t2.[joinkey])
This way you first update the rows that match and then insert the ones that do not. If you want it done in one go, you can wrap both statements in a transaction.
This is commonly known as an UPSERT operation. Yes, you are correct in saying MERGE has some issues, so stay away from it.
A simple approach assuming there is a Primary Key column in Both tables called PK_Col would be something like this...
BEGIN TRANSACTION;
-- Update already existing records
UPDATE T2
SET T2.Col1 = T1.Col1
,T2.Col2 = T1.Col2
FROM T2 INNER JOIN T1 ON T2.PK_COl = T1.PK_Col
WHERE T1.ID ='I';
-- Insert missing records
INSERT INTO T2 (COL1, COL2 )
SELECT COL1, COL2
FROM T1
WHERE T1.ID ='I'
AND NOT EXISTS (SELECT 1
FROM T2
WHERE T2.PK_COl = T1.PK_Col )
COMMIT TRANSACTION;
Wrap the whole UPSERT operation in one transaction.
You can use IF EXISTS something like:
if exists (select * from table with (updlock,serializable) where key = @key)
begin
    update table set ...
    where key = @key
end
else
begin
    insert table (key, ...)
    values (@key, ...)
end
Another solution is to check @@ROWCOUNT:
UPDATE MyTable SET FieldA=@FieldA WHERE Key=@Key
IF @@ROWCOUNT = 0
    INSERT INTO MyTable (Key, FieldA) VALUES (@Key, @FieldA)
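The update-then-insert pattern for many rows at once can be tried out end to end with a small script. This is a sketch under stated assumptions rather than T-SQL: it runs on SQLite through Python's sqlite3 module, it borrows the PK_Col and ID names from the answer above, the sample rows are made up, and it uses correlated subqueries in the UPDATE instead of UPDATE ... FROM.

```python
import sqlite3

def upsert_from_t1(conn, wanted_id):
    """Copy T1 rows with the given ID into T2: update matches, insert the rest."""
    with conn:  # one transaction around both statements
        # Update T2 rows whose key exists among the selected T1 rows.
        conn.execute(
            "UPDATE T2 SET "
            "Col1 = (SELECT Col1 FROM T1 WHERE T1.PK_Col = T2.PK_Col), "
            "Col2 = (SELECT Col2 FROM T1 WHERE T1.PK_Col = T2.PK_Col) "
            "WHERE EXISTS (SELECT 1 FROM T1 "
            "              WHERE T1.PK_Col = T2.PK_Col AND T1.ID = ?)",
            (wanted_id,))
        # Insert the selected T1 rows whose key is not in T2 yet.
        conn.execute(
            "INSERT INTO T2 (PK_Col, Col1, Col2) "
            "SELECT PK_Col, Col1, Col2 FROM T1 "
            "WHERE T1.ID = ? "
            "AND NOT EXISTS (SELECT 1 FROM T2 WHERE T2.PK_Col = T1.PK_Col)",
            (wanted_id,))

def demo():
    """k1 exists and is selected, k2 is new, k3 exists but is not selected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T1 "
                 "(PK_Col TEXT PRIMARY KEY, ID TEXT, Col1 TEXT, Col2 TEXT)")
    conn.execute("CREATE TABLE T2 (PK_Col TEXT PRIMARY KEY, Col1 TEXT, Col2 TEXT)")
    conn.executemany("INSERT INTO T1 VALUES (?, ?, ?, ?)",
                     [("k1", "I", "new1", "x1"), ("k2", "I", "new2", "x2"),
                      ("k3", "J", "skip", "s")])
    conn.executemany("INSERT INTO T2 VALUES (?, ?, ?)",
                     [("k1", "old", "old"), ("k3", "keep", "keep")])
    upsert_from_t1(conn, "I")
    return conn.execute(
        "SELECT PK_Col, Col1, Col2 FROM T2 ORDER BY PK_Col").fetchall()
```

Applying the same filter to both statements keeps rows outside the selection (k3 here) untouched, and the single transaction means a failure in the insert also undoes the update.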

Copy data from one column to another from different tables in sqlite

I want to copy data into column A in Table1 from column B in Table2. The rows of column A are empty, and Table1 has other columns that are already populated. So I need to grab the whole of column B from Table2 and insert all those values into column A in Table1. The two tables are otherwise identical, except that column A has no values at all.
How do I do this in sqlite3?
Use:
INSERT INTO TABLE1
SELECT B,
NULL,
NULL,
NULL
FROM TABLE2
Use NULL as the placeholder for however many columns you can't populate from TABLE2, assuming TABLE1 columns allow NULL values.
UPDATE TABLE1 SET A = (SELECT B FROM TABLE2 WHERE ...)
Come to think of it, if the tables are truly identical, why do you need two of them? In any case you can also do this:
BEGIN;
DELETE FROM TABLE1;
INSERT INTO TABLE1 (A, col1, col2, ...) SELECT B, col1, col2, ... FROM TABLE2;
COMMIT;
Try this:
INSERT INTO TABLE1 (A) SELECT B FROM TABLE2
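Because Table1 already has rows, filling column A in place needs an UPDATE whose subquery is tied to the row being updated; a plain INSERT would append new rows instead. Here is a minimal sketch using Python's sqlite3 module. It assumes both tables share a key column, called id here purely for illustration (the question does not name one), and the column other stands in for the remaining populated columns.

```python
import sqlite3

def copy_column(conn):
    """Fill Table1.A from Table2.B for rows that share the same id."""
    with conn:
        conn.execute(
            "UPDATE Table1 "
            "SET A = (SELECT B FROM Table2 WHERE Table2.id = Table1.id)")

def demo():
    conn = sqlite3.connect(":memory:")
    # 'id' and 'other' are hypothetical column names used for this example.
    conn.execute("CREATE TABLE Table1 (id INTEGER PRIMARY KEY, A TEXT, other TEXT)")
    conn.execute("CREATE TABLE Table2 (id INTEGER PRIMARY KEY, B TEXT, other TEXT)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)",
                     [(1, None, "x"), (2, None, "y")])
    conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)",
                     [(1, "b1", "x"), (2, "b2", "y")])
    copy_column(conn)
    return conn.execute("SELECT id, A, other FROM Table1 ORDER BY id").fetchall()
```

A Table1 row without a matching id in Table2 would get NULL in A, which is harmless here because A starts out empty.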

Avoid duplicates in INSERT INTO SELECT query in SQL Server

I have the following two tables:
Table1
----------
ID Name
1 A
2 B
3 C
Table2
----------
ID Name
1 Z
I need to insert data from Table1 to Table2. I can use the following syntax:
INSERT INTO Table2(Id, Name) SELECT Id, Name FROM Table1
However, in my case, duplicate IDs might exist in Table2 (in my case, it's just "1") and I don't want to copy that again as that would throw an error.
I can write something like this:
IF NOT EXISTS(SELECT 1 FROM Table2 WHERE Id=1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1
ELSE
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 WHERE Table1.Id<>1
Is there a better way to do this without using IF - ELSE? I want to avoid two INSERT INTO-SELECT statements based on some condition.
Using NOT EXISTS:
INSERT INTO TABLE_2
(id, name)
SELECT t1.id,
t1.name
FROM TABLE_1 t1
WHERE NOT EXISTS(SELECT id
FROM TABLE_2 t2
WHERE t2.id = t1.id)
Using NOT IN:
INSERT INTO TABLE_2
(id, name)
SELECT t1.id,
t1.name
FROM TABLE_1 t1
WHERE t1.id NOT IN (SELECT id
FROM TABLE_2)
Using LEFT JOIN/IS NULL:
INSERT INTO TABLE_2
(id, name)
SELECT t1.id,
t1.name
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
WHERE t2.id IS NULL
Of the three options, the LEFT JOIN/IS NULL is less efficient. See this link for more details.
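The three forms can be checked against the sample data from the question. The sketch below runs each one on a fresh SQLite database through Python's sqlite3 module (not SQL Server) and uses the question's table names Table1 and Table2; it only shows that the results agree and says nothing about relative speed.

```python
import sqlite3

# The three anti-join variants, written against the question's tables.
VARIANTS = {
    "not_exists": (
        "INSERT INTO Table2 (Id, Name) "
        "SELECT t1.Id, t1.Name FROM Table1 t1 "
        "WHERE NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t2.Id = t1.Id)"),
    "not_in": (
        "INSERT INTO Table2 (Id, Name) "
        "SELECT t1.Id, t1.Name FROM Table1 t1 "
        "WHERE t1.Id NOT IN (SELECT Id FROM Table2)"),
    "left_join": (
        "INSERT INTO Table2 (Id, Name) "
        "SELECT t1.Id, t1.Name FROM Table1 t1 "
        "LEFT JOIN Table2 t2 ON t2.Id = t1.Id "
        "WHERE t2.Id IS NULL"),
}

def run_variant(sql):
    """Load the question's sample rows, run one variant, return Table2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute("CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?)",
                     [(1, "A"), (2, "B"), (3, "C")])
    conn.execute("INSERT INTO Table2 VALUES (1, 'Z')")
    conn.execute(sql)
    return conn.execute("SELECT Id, Name FROM Table2 ORDER BY Id").fetchall()

def demo():
    return {name: run_variant(sql) for name, sql in VARIANTS.items()}
```

In each case the existing row with Id 1 keeps its name Z and only Ids 2 and 3 are copied, with no primary key violation.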
In MySQL you can do this:
INSERT IGNORE INTO Table2(Id, Name) SELECT Id, Name FROM Table1
Does SQL Server have anything similar?
I just had a similar problem, the DISTINCT keyword works magic:
INSERT INTO Table2(Id, Name) SELECT DISTINCT Id, Name FROM Table1
I was facing the same problem recently...
Here's what worked for me in MS SQL Server 2017...
The primary key should be set on ID in table 2...
The columns and column properties should be the same of course between both tables. This will work the first time you run the below script. The duplicate ID in table 1, will not insert...
If you run it the second time, you will get a
Violation of PRIMARY KEY constraint error
This is the code:
Insert into Table_2
Select distinct *
from Table_1
where table_1.ID >1
Using ignore duplicates on the unique index, as suggested by IanC here, was my solution for a similar issue: create the index with the option WITH IGNORE_DUP_KEY.
In backward compatible syntax, WITH IGNORE_DUP_KEY is equivalent to WITH IGNORE_DUP_KEY = ON.
Ref.: index_option
In SQL Server you can set a unique key index on the table for the columns that need to be unique.
A little off topic, but if you want to migrate the data to a new table, and the possible duplicates are in the original table, and the column possibly duplicated is not an id, a GROUP BY will do:
INSERT INTO TABLE_2
(name)
SELECT t1.name
FROM TABLE_1 t1
GROUP BY t1.name
In my case, I had duplicate IDs in the source table, so none of the proposals worked. I don't care about performance, it's just done once.
To solve this I took the records one by one with a cursor to ignore the duplicates.
So here's the code example:
DECLARE @c1 AS VARCHAR(12);
DECLARE @c2 AS VARCHAR(250);
DECLARE @c3 AS VARCHAR(250);
DECLARE MY_cursor CURSOR STATIC FOR
Select
    c1,
    c2,
    c3
from T2
where ....;
OPEN MY_cursor
FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3
WHILE @@FETCH_STATUS = 0
BEGIN
    if (select count(1)
        from T1
        where a1 = @c1
        and a2 = @c2
    ) = 0
        INSERT INTO T1
        values (@c1, @c2, @c3)
    FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3
END
CLOSE MY_cursor
DEALLOCATE MY_cursor
I used a MERGE query to fill a table without duplicates.
The problem I had was a two-column key in the tables (Code, Value),
and the EXISTS query was very slow.
The MERGE executed much faster (more than 100x).
examples for MERGE query
For one table this works perfectly when you create one unique index over multiple fields. Then a simple "INSERT IGNORE" will ignore duplicates if ALL of the 7 fields (in this case) have the SAME values.
Select the fields in the phpMyAdmin (PMA) Structure view and click Unique; a new combined index will be created.
A simple DELETE before the INSERT would suffice:
DELETE FROM Table2 WHERE Id IN (SELECT Id FROM Table1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1
Switching Table1 for Table2 depending on which table's Id and name pairing you want to preserve.