PostgreSQL: how to update another table while ON CONFLICT

I have two tables:
CREATE TABLE people (
name VARCHAR(100) NOT NULL,
company_id int8 NOT NULL
);
CREATE TABLE company (
id int8 NOT NULL
);
I want to copy data from the csv files into the DB. This is my script:
BEGIN;
CREATE TEMP TABLE tmp_company
ON COMMIT DROP AS SELECT * FROM company WITH NO DATA;
\COPY tmp_company FROM 'company.csv' WITH CSV HEADER DELIMITER AS ','
INSERT INTO company
SELECT * FROM tmp_company
ON CONFLICT DO NOTHING;
CREATE TEMP TABLE tmp_people
ON COMMIT DROP AS SELECT * FROM people WITH NO DATA;
\COPY tmp_people FROM 'people.csv' WITH CSV HEADER DELIMITER AS ','
INSERT INTO people
SELECT * FROM tmp_people
ON CONFLICT DO NOTHING;
COMMIT;
If an incoming company id already exists in the company table, I should keep incrementing the id until a free one is found, insert the company with that new id, and use the new id as company_id for the related people records.
Example:
company.csv
id
1
5
people.csv
name,company_id
tom,1
paul,5
existing company table data
id
1
2
existing people table data
name,company_id
tom,1
paul,2
After copying data from csv to DB, the data should look like
company table data
id
1
2
3 <-- from csv data, as 1,2 are used, set id=3
5
people table data
name,company_id
tom,1
paul,2
tom,3 <-- from csv data
paul,5 <-- from csv data
How can I do this? I am wondering if I can add the logic after ON CONFLICT...
Edit 1:
The two tables are close to 5 TB in size, and the two csv files contain about 5M records each.

First, you should use the bigserial data type instead of int8 for the id column of the company table, so that the id is automatically increased when a new row is inserted.
Then, you should create a foreign key between the people and company tables with the option ON UPDATE CASCADE, so that any change to the id column of the company table is automatically propagated to the company_id column of the people table.
CREATE TABLE company (
id bigserial NOT NULL
);
CREATE TABLE people (
name VARCHAR(100) NOT NULL,
company_id int8 NOT NULL,
CONSTRAINT fkey FOREIGN KEY (company_id) REFERENCES company(id) ON UPDATE CASCADE
);
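The remapping itself still has to be done explicitly, since ON CONFLICT cannot update a different table. Below is a minimal staging-based sketch (PostgreSQL), assuming the temp tables from the question are already loaded. It moves conflicting incoming ids past the current maximum id; the question's "keep incrementing until free" rule would pick the smallest free id instead, but max() + row_number() is a simpler stand-in that gives the same result for the example data:
BEGIN;
-- map each incoming id that already exists to a fresh, unused id
-- (assumption: ids above the current maximum are acceptable)
CREATE TEMP TABLE id_map ON COMMIT DROP AS
SELECT t.id AS old_id,
       (SELECT max(id) FROM company) + row_number() OVER (ORDER BY t.id) AS new_id
FROM tmp_company t
WHERE EXISTS (SELECT 1 FROM company c WHERE c.id = t.id);
-- insert companies, substituting the remapped id where there was a conflict
INSERT INTO company (id)
SELECT coalesce(m.new_id, t.id)
FROM tmp_company t
LEFT JOIN id_map m ON m.old_id = t.id;
-- insert people, pointing rows with a remapped company at the new id
INSERT INTO people (name, company_id)
SELECT p.name, coalesce(m.new_id, p.company_id)
FROM tmp_people p
LEFT JOIN id_map m ON m.old_id = p.company_id;
COMMIT;
With the example data this yields company ids 1, 2, 3, 5 and people rows (tom,1), (paul,2), (tom,3), (paul,5).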

Related

Creating a delete trigger with archive table for records and all of its related records in SQL

I need to create a delete trigger with an archive table, so that when I delete a record from one table it goes into the archive table along with its related records from the other tables.
I have 4 tables called Songs, Genres, Performers and Directors, plus the relation tables SongPerformers connecting Songs and Performers, SongGenres connecting Songs and Genres, and SongDirectors connecting Songs and Directors.
I created the delete trigger for the Songs table and the archive table too. The problem is that when deleting from the Songs table, only the Songs record is deleted; the related records in the other tables are not.
Here is the database diagram of how the database looks and the trigger I created:
Database diagram
Trigger:
CREATE TABLE Archives
(
Id INT PRIMARY KEY NOT NULL IDENTITY,
SongTitle VARCHAR (50) NOT NULL,
SongReleaseDate date NOT NULL,
SongTime float NOT NULL,
SongLanguage VARCHAR (50) NOT NULL,
Date DATETIME NOT NULL
)
GO
CREATE TRIGGER Archivetrigger
ON Songs
AFTER DELETE
AS
INSERT INTO Archives (SongTitle, SongReleaseDate, SongTime, SongLanguage, [Date])
SELECT
d.SongTitle, d.SongReleaseDate, d.SongTime, d.SongLanguage, GETDATE()
FROM
deleted d
GO
-- test: delete one song, then check the archive
DELETE FROM Songs WHERE SongID = 1
SELECT * FROM Archives
How can I add the other tables to the trigger, so that when I delete a record from Songs all of its related records are deleted too and the song is added to the archive table?
So I found the answer that works for my case (it's not perfect, but it works). All I had to do in the trigger was add a few lines that delete the related records from the other tables:
CREATE TABLE Archives
(
ID INT PRIMARY KEY NOT NULL IDENTITY,
SongTitle VARCHAR (50) NOT NULL,
SongReleaseDate date NOT NULL,
SongTime float NOT NULL,
SongLanguage VARCHAR (50) NOT NULL,
Date DATETIME NOT NULL
)
GO
CREATE TRIGGER Archivetrigger
ON Songs
AFTER delete
AS
BEGIN
INSERT INTO Archives (SongTitle, SongReleaseDate, SongTime, SongLanguage, [Date])
SELECT d.SongTitle, d.SongReleaseDate, d.SongTime, d.SongLanguage, GETDATE()
FROM deleted d
-- note: TOP 1 assumes a single-row delete; a multi-row DELETE would
-- archive every row but only clean up one song's relations
DELETE FROM SongPerformers WHERE SongID = (SELECT TOP 1 SongID FROM deleted)
DELETE FROM SongDirectors WHERE SongID = (SELECT TOP 1 SongID FROM deleted)
DELETE FROM SongGenres WHERE SongID = (SELECT TOP 1 SongID FROM deleted)
END
Here is the general syntax; it's obviously not perfect, but it's close to what I need and it works for me:
CREATE TRIGGER [dbo].[*nameoftrigger*]
ON [dbo].[*nameoftable*]
AFTER DELETE
AS
BEGIN
DELETE FROM *tableyoudeletefrom* WHERE *tableID* = (SELECT TOP 1 *tableID* FROM DELETED)
END
You can add all the tables you need to delete from.
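A set-based variant (T-SQL, same schema; the trigger name here is made up) that also handles multi-row deletes, by matching every SongID in the deleted pseudo-table instead of just the first one:
CREATE TRIGGER ArchiveTriggerSetBased
ON Songs
AFTER DELETE
AS
BEGIN
INSERT INTO Archives (SongTitle, SongReleaseDate, SongTime, SongLanguage, [Date])
SELECT d.SongTitle, d.SongReleaseDate, d.SongTime, d.SongLanguage, GETDATE()
FROM deleted d
-- IN (...) matches all deleted rows, not just one
DELETE FROM SongPerformers WHERE SongID IN (SELECT SongID FROM deleted)
DELETE FROM SongDirectors WHERE SongID IN (SELECT SongID FROM deleted)
DELETE FROM SongGenres WHERE SongID IN (SELECT SongID FROM deleted)
END
Note that if the relation tables have foreign keys pointing at Songs, the delete on Songs will fail before any AFTER trigger fires; in that case ON DELETE CASCADE or an INSTEAD OF trigger is needed instead.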

How to merge rows of one table to another while keeping foreign key constraints on autogenerated columns?

Here are two tables that I have, with Table B referencing Table A:
CREATE TABLE TableA
(
[Id_A] [bigint] IDENTITY(1,1) NOT NULL,
...
CONSTRAINT [PK_TableA_Id_A] PRIMARY KEY CLUSTERED
(
[Id_A] ASC
)
)
CREATE TABLE TableB
(
[Id_B] [bigint] IDENTITY(1,1) NOT NULL,
[RefId_A] [bigint] NOT NULL
...
CONSTRAINT [PK_TableB_Id_B] PRIMARY KEY CLUSTERED
(
[Id_B] ASC
)
)
ALTER TABLE [dbo].[TableB] WITH CHECK ADD CONSTRAINT [FK_Id_A] FOREIGN KEY([RefId_A])
REFERENCES [dbo].[TableA] ([Id_A])
These two tables exist in two databases:
Table A and Table B in database 1;
Table A and Table B in database 2.
I need to merge the rows of Table A from database 1 into Table A of database 2 and the rows of Table B from database 1 into Table B of database 2.
I used the SQL Server Import and Export Wizard and checked the Enable Identity Insert option, but it fails:
An OLE DB record is available. Source: "Microsoft SQL Server Native Client 11.0" Hresult: 0x80004005 Description: "Violation of PRIMARY KEY constraint 'PK_TableB_Id_B'. Cannot insert duplicate key in object 'dbo.TableB'. The duplicate key value is (1).". (SQL Server Import and Export Wizard)
Which seems to make sense. There are rows in Table B of database 1 that have the same auto-generated PK as rows of Table B in database 2.
QUESTION
In this scenario, how can I merge the tables content from database 1 to the tables of database 2 while maintaining the foreign key constraints?
You can try something like the following. Here we assume that all records need to be inserted as new ones (without checking whether some already exist). Both operations are wrapped in a transaction to ensure that either both succeed or neither does.
BEGIN TRY
IF OBJECT_ID('tempdb..#IdentityRelationships') IS NOT NULL
DROP TABLE #IdentityRelationships
CREATE TABLE #IdentityRelationships (
OldIdentity BIGINT,
NewIdentity BIGINT)
BEGIN TRANSACTION
;WITH SourceData AS
(
SELECT
OldIdentity = A.Id_A,
OtherColumn = A.OtherColumn
FROM
Database1.Schema.TableA AS A
)
MERGE INTO
Database2.Schema.TableA AS T
USING
SourceData AS S ON 1 = 0 -- Will always execute the "WHEN NOT MATCHED" operation
WHEN NOT MATCHED THEN
INSERT (
OtherColumn)
VALUES (
S.OtherColumn)
OUTPUT
inserted.Id_A, -- the MERGE clause can OUTPUT source columns, unlike a plain INSERT
S.OldIdentity
INTO
#IdentityRelationships (
NewIdentity,
OldIdentity);
INSERT INTO Database2.Schema.TableB (
RefId_A,
OtherData)
SELECT
RefId_A = I.NewIdentity,
OtherData = T.OtherData
FROM
Database1.Schema.TableB AS T
INNER JOIN #IdentityRelationships AS I ON T.RefID_A = I.OldIdentity
COMMIT
END TRY
BEGIN CATCH
DECLARE @v_ErrorMessage VARCHAR(MAX) = CONVERT(VARCHAR(MAX), ERROR_MESSAGE())
IF @@TRANCOUNT > 0
ROLLBACK
RAISERROR (@v_ErrorMessage, 16, 1)
END CATCH
This is too long for a comment.
There is no simple way to do this. Your primary keys are identity columns that both start at "1", so the relationships are ambiguous.
You have two options:
A composite primary key, identifying the database source of the records.
A new primary key. You can preserve the existing primary key values from one database.
Your question doesn't provide enough information to say which is the better approach: "merge" is not clearly defined.
I might suggest that you just recreate all the tables. Insert all the rows from table A into a new table. Add a new identity primary key. Keep the original primary key and source.
Then bring the data from Table B into a new table, looking up the new primary key in the new Table A. At this point, the new Table B is finished, except for defining the primary key constraint.
Then drop the now-unnecessary columns in the new Table A.
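A sketch of that recreate approach (T-SQL), assuming hypothetical names NewTableA/NewTableB and a single made-up payload column OtherColumn; the pair (SourceDb, Old_Id_A) disambiguates rows that carried the same identity value in both databases:
CREATE TABLE NewTableA (
Id_A bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
Old_Id_A bigint NOT NULL,
SourceDb tinyint NOT NULL, -- 1 or 2: which database the row came from
OtherColumn nvarchar(100) NULL -- hypothetical payload column
)
INSERT INTO NewTableA (Old_Id_A, SourceDb, OtherColumn)
SELECT Id_A, 1, OtherColumn FROM Database1.dbo.TableA
UNION ALL
SELECT Id_A, 2, OtherColumn FROM Database2.dbo.TableA
-- rebuild Table B by looking up the new key via (SourceDb, old id)
CREATE TABLE NewTableB (
Id_B bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
RefId_A bigint NOT NULL REFERENCES NewTableA (Id_A)
)
INSERT INTO NewTableB (RefId_A)
SELECT n.Id_A
FROM (
SELECT RefId_A, 1 AS SourceDb FROM Database1.dbo.TableB
UNION ALL
SELECT RefId_A, 2 FROM Database2.dbo.TableB
) b
INNER JOIN NewTableA n ON n.Old_Id_A = b.RefId_A AND n.SourceDb = b.SourceDb
Once NewTableB is verified, the Old_Id_A and SourceDb columns can be dropped from NewTableA, as suggested above.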

How to create a table with ONE existing row from another table?

I'm frankly new to SQL and this is a project I'm doing.
I would like to know if there's a way to connect one column in one table to another table when creating the tables. I know of the join method to show results, but I want to minimize my code as much as possible.
CREATE TABLE players (
id INT PRIMARY KEY, -- the column I want to connect with table match_records
player_name CHARACTER
);
CREATE TABLE match_records (
id INT PRIMARY KEY /* FROM players */, -- the code I want to be here
winner INT,
loser INT
);
CREATE TABLE players (
id INT NOT NULL PRIMARY KEY, -- the column to connect with table match_records
player_name CHARACTER
);
CREATE TABLE match_records (
id INT NOT NULL PRIMARY KEY REFERENCES players(id), -- the code you wanted here
winner INT,
loser INT
);
This way you restrict match_records.id to values that exist in players.id:
t=# insert into match_records select 1,1,0;
ERROR: insert or update on table "match_records" violates foreign key constraint "match_records_id_fkey"
DETAIL: Key (id)=(1) is not present in table "players".
So I add players:
t=# insert into players(id) values(1),(2);
INSERT 0 2
And now it allows insert:
t=# insert into match_records select 1,1,0;
INSERT 0 1
Update: the t=# prefix in the examples above is just the psql prompt (https://www.postgresql.org/docs/current/static/app-psql.html#APP-PSQL-PROMPTING):
%#
If the session user is a database superuser, then a #, otherwise a >. (The expansion of this value might change during a database session as the result of the command SET SESSION AUTHORIZATION.)
You can do it this way:
CREATE TABLE new_table as SELECT id,... from old_table where id = 1;
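For instance, with the players table from the question, this copies the single existing row with id = 1 into a new table:
CREATE TABLE new_table AS
SELECT id, player_name
FROM players
WHERE id = 1;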

Adding a NOT NULL column to a Redshift table

I'd like to add a NOT NULL column to a Redshift table that has records, an IDENTITY field, and that other tables have foreign keys to.
In PostgreSQL, you can add the column as NULL, fill it in, then ALTER it to be NOT NULL.
In Redshift, the best I've found so far is:
ALTER TABLE my_table ADD COLUMN new_column INTEGER;
-- Fill that column
CREATE TABLE my_table2 (
id INTEGER IDENTITY NOT NULL SORTKEY,
(... all the fields ... )
new_column INTEGER NOT NULL,
PRIMARY KEY(id)
) DISTSTYLE all;
UNLOAD ('select * from my_table')
to 's3://blah' credentials '<aws-auth-args>' ;
COPY my_table2
from 's3://blah' credentials '<aws-auth-args>'
EXPLICIT_IDS;
DROP table my_table;
ALTER TABLE my_table2 RENAME TO my_table;
-- For each table that had a foreign key to my_table:
ALTER TABLE another_table ADD FOREIGN KEY(my_table_id) REFERENCES my_table(id)
Is this the best way of achieving this?
You can achieve this without having to unload to S3:
1. Alter the existing table to add the desired column with a default value.
2. Update that column in some way (in my case it was copying from another column).
3. Create a new table with the column without a default value.
4. Insert into the new table; you must list out the columns rather than using SELECT *, since the column order may not be the same (say, if you want the new column in position 2).
5. Drop the old table.
6. Rename the new table.
7. Alter the table to give it the correct owner (if appropriate).
ex:
-- first add the column w/ a default value
alter table my_table_xyz
add visit_id bigint NOT NULL default 0; -- not null but default value
-- now populate the new column with whatever is appropriate (the key in my case)
update my_table_xyz
set visit_id = key;
-- now create the new table with the proper constraints
create table my_table_xyz_new
(
key bigint not null,
visit_id bigint NOT NULL, -- here it is not null and no default value
adt_id bigint not null
);
-- select all from old into new
insert into my_table_xyz_new
select key, visit_id, adt_id
from my_table_xyz;
-- remove the original table
DROP table my_table_xyz;
-- rename the newly created table to the desired table
alter table my_table_xyz_new rename to my_table_xyz;
-- adjust any views, foreign keys or permissions as required

Restoring a Truncated Table from a Backup

I am restoring the data of a truncated table in an Oracle database from an exported csv file. However, I find that the primary key column auto-increments during the import instead of taking the actual values from the backup file.
I intend to do the following:
1. drop the primary key
2. import the table data
3. add primary key constraints on the required column
Is this a good approach? If not, what is recommended? Thanks.
EDIT: After more investigation, I observed there is a trigger that inserts the nextval of a sequence into the primary key column. This is the source of the problem, so the procedure above would not solve it; the fix lies in the trigger (and/or sequence) on the table. This is solved!
It is easier to use your .csv as an external table and then:
1. create table your_table_temp as select * from the external table
2. examine the data in the new temp table to ensure you know what range of primary keys is present
3. do a MERGE into the target table
Samples from here and here:
CREATE TABLE countries_ext (
country_code VARCHAR2(5),
country_name VARCHAR2(50),
country_language VARCHAR2(50)
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY ext_tab_data
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ','
MISSING FIELD VALUES ARE NULL
(
country_code CHAR(5),
country_name CHAR(50),
country_language CHAR(50)
)
)
LOCATION ('Countries1.txt','Countries2.txt')
)
PARALLEL 5
REJECT LIMIT UNLIMITED;
And the merge:
MERGE INTO employees e
USING hr_records h
ON (e.id = h.emp_id)
WHEN MATCHED THEN
UPDATE SET e.address = h.address
WHEN NOT MATCHED THEN
INSERT (id, address)
VALUES (h.emp_id, h.address);
Edit: after you have merged the data, you can drop the temp table; the result is your previous table with the old data and the new data together.
Edit: you mention "During imports, the primary key column does not insert from the file, but auto-increments". This can only happen when there is a trigger on the table, likely a BEFORE INSERT ... FOR EACH ROW trigger. Disable the trigger, do your import, and re-enable the trigger after committing your inserts.
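For example (Oracle; the trigger name here is a made-up placeholder):
-- disable the before-insert trigger so the csv values are kept
ALTER TRIGGER target_table_bi_trg DISABLE;
-- run the import / inserts here, then commit
ALTER TRIGGER target_table_bi_trg ENABLE;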
I used the following procedure to solve it:
1. Drop the trigger: DROP TRIGGER trigger_name
2. Import the table data into the target table.
3. Drop the sequence: DROP SEQUENCE sequence_name
4. Recreate the sequence so that nextval starts after the imported keys:
CREATE SEQUENCE SEQ_NAME INCREMENT BY 1 START WITH start_index_for_next_val MAXVALUE max_val MINVALUE 1 NOCYCLE CACHE 20 NOORDER
5. Recreate the trigger:
CREATE OR REPLACE TRIGGER "schema_name"."trigger_name"
before insert on target_table
for each row
begin
select seq_name.nextval
into :new.unique_column_name
from dual;
end;