Moving table columns to new table and referencing as foreign key in PostgreSQL

Suppose we have a DB table with fields
"id", "category", "subcategory", "brand", "name", "description", etc.
What's a good way of creating separate tables for
category, subcategory and brand
and the corresponding columns and rows in the original table becoming foreign key references?
To outline the operations involved:
1. get all unique values in each column of the original table which should become foreign keys;
2. create tables for those;
3. create foreign key reference columns in the original table (or a copy).
In this case, the PostgreSQL DB is accessed via Sequel in a Ruby app, so available interfaces are the command line, Sequel, PGAdmin, etc...
The question: how would you do this?

-- Some test data
CREATE TABLE animals
( id SERIAL NOT NULL PRIMARY KEY
, name varchar
, category varchar
, subcategory varchar
);
INSERT INTO animals(name, category, subcategory) VALUES
( 'Chimpanzee' , 'mammals', 'apes' )
,( 'Urang Utang' , 'mammals', 'apes' )
,( 'Homo Sapiens' , 'mammals', 'apes' )
,( 'Mouse' , 'mammals', 'rodents' )
,( 'Rat' , 'mammals', 'rodents' )
;
-- [empty] table to contain the "squeezed out" domain
CREATE TABLE categories
( id SERIAL NOT NULL PRIMARY KEY
, category varchar
, subcategory varchar
, UNIQUE (category,subcategory)
);
-- The original table needs a "link" to the new table
ALTER TABLE animals
ADD column category_id INTEGER -- NOT NULL
REFERENCES categories(id)
;
-- FK constraints are helped a lot by a supportive index.
CREATE INDEX animals_categories_fk ON animals (category_id);
-- Chained query to:
-- * populate the domain table
-- * initialize the FK column in the original table
WITH ins AS (
INSERT INTO categories(category, subcategory)
SELECT DISTINCT a.category, a.subcategory
FROM animals a
RETURNING *
)
UPDATE animals ani
SET category_id = ins.id
FROM ins
WHERE ins.category = ani.category
AND ins.subcategory = ani.subcategory
;
-- Now that we have the FK pointing to the new table,
-- we can drop the redundant columns.
ALTER TABLE animals DROP COLUMN category, DROP COLUMN subcategory;
-- show it to the world
SELECT a.*
, c.category, c.subcategory
FROM animals a
JOIN categories c ON c.id = a.category_id
;
Note: the fragment:
WHERE ins.category = ani.category
AND ins.subcategory = ani.subcategory
will lead to problems if these columns contain NULLs.
It would be better to compare them using
(ins.category,ins.subcategory)
IS NOT DISTINCT FROM
(ani.category,ani.subcategory)
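The same sequence of steps can be tried end to end without a PostgreSQL server. The following is only a sketch, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL; the function names normalize_categories and build_demo are made up for this illustration. SQLite's IS operator plays the role of IS NOT DISTINCT FROM, so a row with a NULL subcategory still gets linked.

```python
import sqlite3


def normalize_categories(conn):
    """Move (category, subcategory) pairs into a lookup table and link back by id."""
    conn.execute(
        "CREATE TABLE categories ("
        " id INTEGER PRIMARY KEY,"
        " category TEXT,"
        " subcategory TEXT)"
    )
    conn.execute(
        "ALTER TABLE animals ADD COLUMN category_id INTEGER REFERENCES categories(id)"
    )
    # SELECT DISTINCT treats NULLs as equal, so each pair is stored once.
    conn.execute(
        "INSERT INTO categories (category, subcategory) "
        "SELECT DISTINCT category, subcategory FROM animals"
    )
    # SQLite's IS is a NULL-safe comparison (assumption: it stands in for
    # PostgreSQL's IS NOT DISTINCT FROM in this sketch).
    conn.execute(
        "UPDATE animals SET category_id = ("
        " SELECT c.id FROM categories c"
        " WHERE c.category IS animals.category"
        "   AND c.subcategory IS animals.subcategory)"
    )
    conn.commit()


def build_demo():
    """Create an in-memory table with sample rows, one of them with a NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE animals ("
        " id INTEGER PRIMARY KEY, name TEXT, category TEXT, subcategory TEXT)"
    )
    conn.executemany(
        "INSERT INTO animals (name, category, subcategory) VALUES (?, ?, ?)",
        [
            ("Chimpanzee", "mammals", "apes"),
            ("Mouse", "mammals", "rodents"),
            ("Rat", "mammals", "rodents"),
            ("Unknown", "mammals", None),
        ],
    )
    normalize_categories(conn)
    return conn


if __name__ == "__main__":
    demo = build_demo()
    for row in demo.execute(
        "SELECT a.name, c.category, c.subcategory "
        "FROM animals a JOIN categories c ON c.id = a.category_id ORDER BY a.id"
    ):
        print(row)
```

With a plain = comparison in the UPDATE, the 'Unknown' row would keep a NULL category_id, which is the problem the note above warns about.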

I'm not sure I completely understand your question; if this doesn't answer it, please leave a comment and clarify the question. It sounds like you want to do a CREATE TABLE xxx AS. For example:
CREATE TABLE category AS (SELECT DISTINCT(category) AS id FROM parent_table);
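Then give the new table a primary key (a foreign key can only reference columns covered by a primary key or unique constraint) and alter parent_table to add the foreign key constraint. The primary key will be rejected if the category column contains NULLs, so handle those first.
ALTER TABLE category ADD PRIMARY KEY (id);
ALTER TABLE parent_table ADD CONSTRAINT category_fk FOREIGN KEY (category) REFERENCES category (id);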
Repeat this for each table you want to create.
Here is the related documentation:
CREATE TABLE
ALTER TABLE
Note: code and references are for PostgreSQL 9.4

Related

How to select from table A and then insert selected id inside table B with one query?

I'm trying to implement a very basic banking system.
The goal is to have the different types of transactions (deposit, withdraw, transfer) in one table and refer to them by ID in the transactions table.
CREATE TABLE transaction_types (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
name VARCHAR UNIQUE NOT NULL
)
CREATE TABLE transactions (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
type_id INTEGER NOT NULL,
amount FLOAT NOT NULL
)
What I'm trying to accomplish is:
1. When inserting into the transactions table, no record can have an invalid type_id (type_id must exist in the transaction_types table).
2. Get the type_id from the transaction_types table and then insert into the transactions table, in one query (if that's possible; I'm fairly new).
I'm using Node.js/Typescript and PostgreSQL, any help is appreciated a lot.
For (1): modify the transactions table definition by adding REFERENCES transaction_types(id) to the end of the type_id column definition, before the comma.
For (2), assuming you know the name of the transaction type, you can accomplish this with:
INSERT INTO transactions(type_id, amount)
VALUES ((SELECT id FROM transaction_types WHERE name = 'Withdrawal'), 999.99);
By the way, PostgreSQL requires SERIAL instead of INTEGER AUTO_INCREMENT.
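To see both points working together, here is a small sketch that assumes Python's sqlite3 module in place of PostgreSQL (the asker uses Node.js, so treat this purely as an illustration). The helper names open_bank and add_transaction are invented for the example. An unknown type name makes the subquery return NULL, so the NOT NULL constraint rejects the row, and a made-up type_id is rejected by the foreign key.

```python
import sqlite3


def open_bank():
    """Create the two tables with a foreign key from transactions to transaction_types."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on (SQLite-specific).
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE transaction_types ("
        " id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE transactions ("
        " id INTEGER PRIMARY KEY,"
        " type_id INTEGER NOT NULL REFERENCES transaction_types(id),"
        " amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO transaction_types (name) VALUES (?)",
        [("deposit",), ("withdraw",), ("transfer",)],
    )
    conn.commit()
    return conn


def add_transaction(conn, type_name, amount):
    """Look up the type id by name and insert the transaction in one statement."""
    cur = conn.execute(
        "INSERT INTO transactions (type_id, amount) "
        "VALUES ((SELECT id FROM transaction_types WHERE name = ?), ?)",
        (type_name, amount),
    )
    conn.commit()
    return cur.lastrowid


if __name__ == "__main__":
    bank = open_bank()
    print(add_transaction(bank, "withdraw", 999.99))
    try:
        add_transaction(bank, "no-such-type", 1.0)
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
```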

How can I insert a row that references another postgres table via foreign key, and creates the foreign row too if it doesn't exist?

In Postgres, is there a way to atomically insert a row into a table, where one column references another table, and we look up to see if the desired row exists in the referenced table and inserts it as well if it is not?
For example, say we have a US states table and a cities table which references the states table:
CREATE TABLE states (
state_id serial primary key,
name text
);
CREATE TABLE cities (
city_id serial,
name text,
state_id int references states(state_id)
);
When I want to add the city of Austin, Texas, I want to be able to see whether Texas exists in the states table, and if so use its state_id in the new row I'm inserting in the cities table. If Texas doesn't exist in the states table, I want to create it and then use its id in the cities table.
I tried this query, but I got an error saying
ERROR: WITH clause containing a data-modifying statement must be at the top level
LINE 2: WITH inserted AS (
^
WITH state_id AS (
WITH inserted AS (
INSERT INTO states(name)
VALUES ('Texas')
ON CONFLICT DO NOTHING
RETURNING state_id),
already_there AS (
SELECT state_id FROM states
WHERE name='Texas')
SELECT * FROM inserted
UNION
SELECT * FROM already_there)
INSERT INTO cities(name, state_id)
VALUES
('Austin', (SELECT state_id FROM state_id));
Am I overlooking a simple solution?
Here is one option:
with inserted as (
insert into states(name) values ('Texas')
on conflict do nothing
returning state_id
)
insert into cities(name, state_id)
values (
'Dallas',
coalesce(
(select state_id from inserted),
(select state_id from states where name = 'Texas')
)
);
The idea is to attempt to insert in a CTE, and then, in the main insert, check if a value was inserted, else select it.
For this to work properly, you need a unique constraint on states(name):
create table states (
state_id serial primary key,
name text unique
);
You can force the insert statement to return a value:
WITH inserted AS (
INSERT INTO states (name)
VALUES ('Texas')
ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.NAME
RETURNING state_id
)
. . .
The DO UPDATE SET forces the INSERT to return something.
I notice that you don't have a unique constraint, so you also need that:
ALTER TABLE states ADD CONSTRAINT unq_state_name
UNIQUE (name);
Otherwise the ON CONFLICT doesn't have anything to work with.
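The "insert the parent if missing, then insert the child" idea from the answers above can also be sketched as two statements plus a lookup inside one transaction. This is an illustration only, assuming Python's sqlite3 module; SQLite's INSERT OR IGNORE is used here as a stand-in for PostgreSQL's ON CONFLICT DO NOTHING, and open_geo and add_city are hypothetical helper names. As in the answers, it depends on the unique constraint on states(name).

```python
import sqlite3


def open_geo():
    """Create states and cities; states.name must be unique for the pattern to work."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute(
        "CREATE TABLE states (state_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE cities ("
        " city_id INTEGER PRIMARY KEY,"
        " name TEXT,"
        " state_id INTEGER REFERENCES states(state_id))"
    )
    return conn


def add_city(conn, city, state):
    """Insert the state only if it is missing, then insert the city pointing at it."""
    with conn:  # one transaction: committed on success, rolled back on error
        # Does nothing when the unique constraint on name would be violated.
        conn.execute("INSERT OR IGNORE INTO states (name) VALUES (?)", (state,))
        state_id = conn.execute(
            "SELECT state_id FROM states WHERE name = ?", (state,)
        ).fetchone()[0]
        cur = conn.execute(
            "INSERT INTO cities (name, state_id) VALUES (?, ?)", (city, state_id)
        )
    return cur.lastrowid


if __name__ == "__main__":
    geo = open_geo()
    add_city(geo, "Austin", "Texas")
    add_city(geo, "Dallas", "Texas")
    print(geo.execute("SELECT name, state_id FROM cities ORDER BY city_id").fetchall())
```

The single-statement CTE versions above avoid the extra round trips; this form trades that for readability.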

Multiple autoincrement ids based on table column

I need help in database design.
I have following tables.
Pseudo code:
Table order_status {
id int[pk, increment]
name varchar
}
Table order_status_update {
id int[pk, increment]
order_id int[ref: > order.id]
order_status_id int[ref: > order_status.id]
updated_at datetime
}
Table order_category {
id int[pk, increment]
name varchar
}
Table file {
id int[pk, increment]
order_id int[ref: > order.id]
key varchar
name varchar
path varchar
}
Table order {
id int [pk] // primary key
order_status_id int [ref: > order_status.id]
order_category_id int [ref: > order_category.id]
notes varchar
attributes json // no of attributes is not fixed, hence needed a json column
}
Everything was okay, but now I need an auto-increment id for each type of order_category_id column.
For example, if I have 2 categories, electronics and toys, then I would need the values electronics-1, toy-1, toy-2, electronics-2, electronics-3, toy-3, toy-4, toy-5 associated with rows of the order table. But that's not possible, because auto-increment increments for each new row, not per category.
In other words, for table order instead of
id order_category_id
---------------------
1 1
2 1
3 1
4 2
5 1
6 2
7 1
I need following,
id order_category_id pretty_ids
----------------------------
1 1 toy-1
2 1 toy-2
3 1 toy-3
4 2 electronics-1
5 1 toy-4
6 2 electronics-2
7 1 toy-5
What I tried:
I created a separate table for each order category (not an ideal solution, but I currently have 6 order categories, so it works for now).
Now I have the tables electronics_order and toys_order. The columns are repetitive, but it works. But now I have another problem: every relationship with other tables is broken. Since electronics_order and toys_order can both have the same id, I cannot use the id column to reference the order_status_update, order_status and file tables.
I could add another column, order_category, to each of these tables, but would that be the right way? I am not experienced in database design, so I would like to know how others do it.
I also have a side question.
Do I need tables for order_category and order_status just to store names? Because these values will not change much and I can store them in code and save in columns of order table.
I know separate tables are good for flexibility, but I had to query database 2 times to fetch order_status and order_category by name before inserting new row to order table. And later it will be multiple join for querying order table.
--
If it helps, I am using flask-sqlalchemy in backend and postgresql as database server.
In order to track the increment id which is based on the order_category, we can keep track of this value on another table. Let us call this table: order_category_sequence. To show my solution, I just created simplified version of order table with order_category.
CREATE TABLE order_category (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NULL
);
CREATE TABLE order_category_sequence (
id SERIAL PRIMARY KEY,
order_category_id int NOT NULL,
current_key int not null
);
Alter Table order_category_sequence Add Constraint "fk_order_category_id" FOREIGN KEY (order_category_id) REFERENCES order_category (id);
Alter Table order_category_sequence Add Constraint "uc_order_category_id" UNIQUE (order_category_id);
CREATE TABLE "order" (
id SERIAL PRIMARY KEY,
order_category_id int NOT NULL,
pretty_id VARCHAR(100) null
);
Alter Table "order" Add Constraint "fk_order_category_id" FOREIGN KEY (order_category_id) REFERENCES order_category (id);
The order_category_id column in the order_category_sequence table refers to order_category. The current_key column holds the last value used for that category in the order table.
When a new order row is added, we can use a trigger to read the last value from order_category_sequence and update pretty_id. The following trigger definition can be used to achieve this.
--function called everytime a new order is added
CREATE OR REPLACE FUNCTION on_order_created()
RETURNS trigger AS
$BODY$
DECLARE
current_pretty_id varchar(100);
BEGIN
-- increment last value of the corresponding order_category_id in the sequence table
Update order_category_sequence
set current_key = (current_key + 1)
where order_category_id = NEW.order_category_id;
--prepare the pretty_id
Select
oc.name || '-' || s.current_key AS current_pretty_id
FROM order_category_sequence AS s
JOIN order_category AS oc on s.order_category_id = oc.id
WHERE s.order_category_id = NEW.order_category_id
INTO current_pretty_id;
--update order table
Update "order"
set pretty_id = current_pretty_id
where id = NEW.id;
RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;
CREATE TRIGGER order_created
AFTER INSERT
ON "order"
FOR EACH ROW
EXECUTE PROCEDURE on_order_created();
If we want to keep the two tables, order_category and order_category_sequence, synchronized, we can use another trigger to add a row to the latter every time a new order category is added.
-- function called every time a new order_category is added
CREATE OR REPLACE FUNCTION on_order_category_created()
RETURNS trigger AS
$BODY$
BEGIN
--insert a new row for the newly inserted order_category
Insert into order_category_sequence(order_category_id, current_key)
values (NEW.id, 0);
RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;
CREATE TRIGGER order_category_created
AFTER INSERT
ON order_category
FOR EACH ROW
EXECUTE PROCEDURE on_order_category_created();
Testing query and result:
Insert into order_category(name)
values ('electronics'),('toys');
Insert into "order"(order_category_id)
values (1),(2),(2);
select * from "order";
Regarding your side question, I prefer to store lookup values like order_status and order_category in separate tables. Doing so gives the flexibility shown above and makes later changes easy.
To answer your side question: yes, you should keep tables with names in them, for a number of reasons. First of all, such tables are small and generally kept in memory by the database, so there is negligible performance benefit to not using the tables. Second, you want to be able to use external tools to query the database and generate reports, and you want these kind of labels available to those tools. Third, you want to minimize the coupling of your software to the actual data so that they can evolve independently. Adding a new category should not require modifying your software.
Now, to the main question, there is no built-in facility for the kind of auto-increment you want. You have to build it yourself.
I suggest you keep the sequence number for each category as a column in the category table. Then you can update it and use the updated sequence number in the order table, like this (which is specific to PostgreSQL):
-- set up the tables
create table orders (
id SERIAL PRIMARY KEY,
order_category_id int,
pretty_id VARCHAR
);
create unique index order_category_pretty_id_idx
on orders (pretty_id);
create table order_category (
id SERIAL PRIMARY KEY,
name varchar NOT NULL,
seq int NOT NULL default 0
);
-- create the categories
insert into order_category
(name) VALUES
('toy'), ('electronics');
-- create orders, specifying the category ID and generating the pretty ID
WITH
new_category_id (id) AS (VALUES (1)), -- 1 here is the category ID for the new order
pretty AS (
UPDATE order_category
SET seq = seq + 1
WHERE id = (SELECT id FROM new_category_id)
RETURNING *
)
INSERT into orders (order_category_id, pretty_id)
SELECT new_category_id.id, concat(pretty.name, '-', pretty.seq)
FROM new_category_id, pretty;
You just plug in your category ID where I have 1 in the example and it will create the new pretty_id for that category. The first order in the category will be toy-1, the next toy-2, etc.
| id | order_category_id | pretty_id |
| --- | ----------------- | ------------- |
| 1 | 1 | toy-1 |
| 2 | 1 | toy-2 |
| 3 | 2 | electronics-1 |
| 4 | 1 | toy-3 |
| 5 | 2 | electronics-2 |
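The counter-per-category idea can be exercised with a short script. This is a sketch under assumptions: it uses Python's sqlite3 module rather than PostgreSQL, splits the single CTE statement into an UPDATE, a SELECT and an INSERT inside one transaction, and the helper names open_shop and create_order are invented here.

```python
import sqlite3


def open_shop():
    """Create the two tables; order_category carries the per-category counter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_category ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " seq INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " order_category_id INTEGER REFERENCES order_category(id),"
        " pretty_id TEXT UNIQUE)"
    )
    conn.executemany(
        "INSERT INTO order_category (name) VALUES (?)", [("toy",), ("electronics",)]
    )
    conn.commit()
    return conn


def create_order(conn, category_id):
    """Bump the category counter and insert the order in the same transaction."""
    with conn:  # rolled back as a whole if anything below fails
        cur = conn.execute(
            "UPDATE order_category SET seq = seq + 1 WHERE id = ?", (category_id,)
        )
        if cur.rowcount != 1:
            raise ValueError(f"unknown category id {category_id}")
        name, seq = conn.execute(
            "SELECT name, seq FROM order_category WHERE id = ?", (category_id,)
        ).fetchone()
        pretty_id = f"{name}-{seq}"
        conn.execute(
            "INSERT INTO orders (order_category_id, pretty_id) VALUES (?, ?)",
            (category_id, pretty_id),
        )
    return pretty_id


if __name__ == "__main__":
    shop = open_shop()
    print([create_order(shop, c) for c in (1, 1, 2, 1, 2)])
```

Because the counter update and the order insert share a transaction, a failed insert does not consume a number; concurrent writers to the same category would queue on the counter row.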
To get toys-1, toys-2, toys-3 you can repeat the logic of order_status_update: there is no difference between tracking a status by time and tracking it by count.
It is just simpler in order_status_update, where you put now() into updated_at. For, say, an order_category_track table you would take the last value + 1, or keep a separate sequence per category (I would not recommend the latter, because it binds database objects to data in the DB).
I would change the schema to:
This schema can end up in an inconsistent state. But in my opinion your application has three different entities, "order", "order_status" and "order category track", which live their own lives.
Even so, it is almost impossible to keep this task consistent without locks, for example. The task is complicated by the condition that each new row depends on the previous ones, which runs against how SQL works.
I would suggest splitting category into a 2-level hierarchy: category (toy, electronic) and subcategory (toy-1, toy-2, electronic-1, etc.):
So you can have the column order_subcategory.full_name contain the compiled "toy-1" value, or you can create a view to build this field on the fly:
select oc.name || '-' || os.number
from order_category as oc
join order_subcategory as os on oc.id = os.category_id
https://dbdiagram.io/d/5dd6a132edf08a25543e34f8
Regarding your questions "Do I need tables for order_category and order_status just to store names?":
It is best practice to store this kind of data as a separate dictionary table. It gives you consistency and reliability. Querying those tables is very fast and easy for RDBMS, so feel free to use it.
I'll focus on only 3 tables you showed: order, order_status and order_category.
Creating a new table for each kind of record is not the right way. From your explanation, I think you are trying to use the order and order_category tables in a many-to-many relationship. If so, what you need is a pivot table like this:
I have put the order_status column in the order table for now;
you can add this column to whichever of these tables you need.
Side question:
For order_status, if the set of statuses is fixed (for example only ACTIVE and INACTIVE, with no more values in the future), it would be better to use a column with an ENUM type.
The easy answer would be to answer your question directly. But I do not think that is a good thing in this case, so I will do otherwise.
I think that maybe the whole design is wrong.
First: clarification of your business needs and assertions.
One order can have multiple categories
One category can concern multiple orders
One order can only have one status at a time, but several over time
One status can be used by multiple orders
One order corresponds to one file (probably a proof of billing)
One file concerns only one order
Second: advice
There is a small number of reserved key words that you should not use as identifiers in a production environment (https://www.postgresql.org/docs/current/sql-keywords-appendix.html). So, for example, I replace the word 'order' with 'command'.
A remaining question that needs an answer before production: why the attributes column in your 'order' table? There is a risk of violating the normal forms here (https://www.geeksforgeeks.org/normal-forms-in-dbms/).
Third: design solution
This is normally enough to give you a good start. But I want to have a little more fun :) So...
Fourth: questions about the performance you need
Estimated load per day/month on orders (ten million rows per month?)
Fifth: proposed physical solution
Archiving in another tablespace (trigger on cancel or terminated => archived)
Indexes in another tablespace (your DBA will thank you for that)
Possible partitioning of the order table (https://pgxn.org/dist/pg_partman/doc/pg_partman.html, https://www.postgresql.org/docs/current/ddl-partitioning.html)
Hardware and option choices (high availability? disaster management? if so, the design needs a little further study)
Data transposition (is it really needed? if so, the design needs a little further study)
The finaaaaaal code-down ! (with the good music)
-- as a postgres user
CREATE DATABASE command_system;
CREATE SCHEMA in_prgoress_command;
CREATE SCHEMA archived_command;
--DROP SCHEMA public;
-- create tablespaces on other location than below
CREATE TABLESPACE command_indexes_tbs location 'c:/Data/indexes';
CREATE TABLESPACE archived_command_tbs location 'c:/Data/archive';
CREATE TABLESPACE in_progress_command_tbs location 'c:/Data/command';
CREATE TABLE in_prgoress_command.command
(
id bigint /*or bigserial if you use a INSERT RETURNING clause*/ primary key
, notes varchar(500)
, fileULink varchar (500)
)
TABLESPACE in_progress_command_tbs;
CREATE TABLE archived_command.command
(
id bigint /*or bigserial if you use a INSERT RETURNING clause*/ primary key
, notes varchar(500)
, fileULink varchar (500)
)
TABLESPACE archived_command_tbs;
CREATE TABLE in_prgoress_command.category
(
id int primary key
, designation varchar(45) NOT NULL
)
TABLESPACE in_progress_command_tbs;
INSERT INTO in_prgoress_command.category
VALUES (1,'Toy'), (2,'Electronic'), (3,'Leather'); --non-exaustive list
CREATE TABLE in_prgoress_command.status
(
id int primary key
, designation varchar (45) NOT NULL
)
TABLESPACE in_progress_command_tbs;
INSERT INTO in_prgoress_command.status
VALUES (1,'Shipping'), (2,'Cancel'), (3,'Terminated'), (4,'Payed'), (5,'Initialised'); --non-exaustive list
CREATE TABLE in_prgoress_command.command_category
(
id bigserial primary key
, idCategory int
, idCommand bigint
)
TABLESPACE in_progress_command_tbs;
ALTER TABLE in_prgoress_command.command_category
ADD CONSTRAINT fk_command_category_category FOREIGN KEY (idCategory) REFERENCES in_prgoress_command.category(id);
ALTER TABLE in_prgoress_command.command_category
ADD CONSTRAINT fk_command_category_command FOREIGN KEY (idCommand) REFERENCES in_prgoress_command.command(id);
CREATE INDEX idx_command_category_category ON in_prgoress_command.command_category USING BTREE (idCategory) TABLESPACE command_indexes_tbs;
CREATE INDEX idx_command_category_command ON in_prgoress_command.command_category USING BTREE (idCommand) TABLESPACE command_indexes_tbs;
CREATE TABLE archived_command.command_category
(
id bigserial primary key
, idCategory int
, idCommand bigint
)
TABLESPACE archived_command_tbs;
ALTER TABLE archived_command.command_category
ADD CONSTRAINT fk_command_category_category FOREIGN KEY (idCategory) REFERENCES in_prgoress_command.category(id);
ALTER TABLE archived_command.command_category
ADD CONSTRAINT fk_command_category_command FOREIGN KEY (idCommand) REFERENCES archived_command.command(id);
CREATE INDEX idx_command_category_category ON archived_command.command_category USING BTREE (idCategory) TABLESPACE command_indexes_tbs;
CREATE INDEX idx_command_category_command ON archived_command.command_category USING BTREE (idCommand) TABLESPACE command_indexes_tbs;
CREATE TABLE in_prgoress_command.command_status
(
id bigserial primary key
, idStatus int
, idCommand bigint
, change_timestamp timestamp --anticipate if you can the time-zone problematic
)
TABLESPACE in_progress_command_tbs;
ALTER TABLE in_prgoress_command.command_status
ADD CONSTRAINT fk_command_status_status FOREIGN KEY (idStatus) REFERENCES in_prgoress_command.status(id);
ALTER TABLE in_prgoress_command.command_status
ADD CONSTRAINT fk_command_status_command FOREIGN KEY (idCommand) REFERENCES in_prgoress_command.command(id);
CREATE INDEX idx_command_status_status ON in_prgoress_command.command_status USING BTREE (idStatus) TABLESPACE command_indexes_tbs;
CREATE INDEX idx_command_status_command ON in_prgoress_command.command_status USING BTREE (idCommand) TABLESPACE command_indexes_tbs;
CREATE UNIQUE INDEX idxu_command_state ON in_prgoress_command.command_status USING BTREE (change_timestamp, idStatus, idCommand) TABLESPACE command_indexes_tbs;
CREATE OR REPLACE FUNCTION sp_trg_archiving_command ()
RETURNS TRIGGER
language plpgsql
as $function$
DECLARE
BEGIN
-- Copy the data (the key column of the command table is id)
INSERT INTO archived_command.command
SELECT *
FROM in_prgoress_command.command
WHERE id = new.idCommand;
INSERT INTO archived_command.command_status (idStatus, idCommand, change_timestamp)
SELECT idStatus, idCommand, change_timestamp
FROM in_prgoress_command.command_status
WHERE idCommand = new.idCommand;
INSERT INTO archived_command.command_category (idCategory, idCommand)
SELECT idCategory, idCommand
FROM in_prgoress_command.command_category
WHERE idCommand = new.idCommand;
-- Delete the data
DELETE FROM in_prgoress_command.command_status
WHERE idCommand = new.idCommand;
DELETE FROM in_prgoress_command.command_category
WHERE idCommand = new.idCommand;
DELETE FROM in_prgoress_command.command
WHERE id = new.idCommand;
-- The result of an AFTER row trigger is ignored, but the function must still return
RETURN NULL;
END;
$function$;
DROP TRIGGER IF EXISTS t_trg_archiving_command ON in_prgoress_command.command_status;
CREATE TRIGGER t_trg_archiving_command
AFTER INSERT
ON in_prgoress_command.command_status
FOR EACH ROW
WHEN (new.idstatus = 2 or new.idStatus = 3)
EXECUTE PROCEDURE sp_trg_archiving_command();
CREATE TABLE archived_command.command_status
(
id bigserial primary key
, idStatus int
, idCommand bigint
, change_timestamp timestamp --anticipate if you can the time-zone problematic
)
TABLESPACE archived_command_tbs;
ALTER TABLE archived_command.command_status
ADD CONSTRAINT fk_command_status_status FOREIGN KEY (idStatus) REFERENCES in_prgoress_command.status(id);
ALTER TABLE archived_command.command_status
ADD CONSTRAINT fk_command_status_command FOREIGN KEY (idCommand) REFERENCES archived_command.command(id);
CREATE INDEX idx_command_status_status ON archived_command.command_status USING BTREE (idStatus) TABLESPACE command_indexes_tbs;
CREATE INDEX idx_command_status_command ON archived_command.command_status USING BTREE (idCommand) TABLESPACE command_indexes_tbs;
CREATE UNIQUE INDEX idxu_command_state ON archived_command.command_status USING BTREE (change_timestamp, idStatus, idCommand) TABLESPACE command_indexes_tbs;
Conclusion:
In many cases, when you are worried about the arrangement of your keys, it is because they are not in the right place. The same goes for car keys! :D
Do not take any solution as gospel: benchmark it.

Duplicating parent, child and grandchild records

I have a parent table that represents a document of-sorts, with each record in the table having n children records in a child table. Each child record can have n grandchild records. These records are in a published state. When the user wants to modify a published document, we need to clone the parent and all of its children and grandchildren.
The table structure looks like this:
Parent
CREATE TABLE [ql].[Quantlist] (
[QuantlistId] INT IDENTITY (1, 1) NOT NULL,
[StateId] INT NOT NULL,
[Title] VARCHAR (500) NOT NULL,
CONSTRAINT [PK_Quantlist] PRIMARY KEY CLUSTERED ([QuantlistId] ASC),
CONSTRAINT [FK_Quantlist_State] FOREIGN KEY ([StateId]) REFERENCES [ql].[State] ([StateId])
);
Child
CREATE TABLE [ql].[QuantlistAttribute]
(
[QuantlistAttributeId] INT IDENTITY (1, 1),
[QuantlistId] INT NOT NULL,
[Narrative] VARCHAR (500) NOT NULL,
CONSTRAINT [PK_QuantlistAttribute] PRIMARY KEY ([QuantlistAttributeId]),
CONSTRAINT [FK_QuantlistAttribute_QuantlistId] FOREIGN KEY ([QuantlistId]) REFERENCES [ql].[Quantlist]([QuantlistId]),
)
Grandchild
CREATE TABLE [ql].[AttributeReference]
(
[AttributeReferenceId] INT IDENTITY (1, 1),
[QuantlistAttributeId] INT NOT NULL,
[Reference] VARCHAR (250) NOT NULL,
CONSTRAINT [PK_QuantlistReference] PRIMARY KEY ([AttributeReferenceId]),
CONSTRAINT [FK_QuantlistReference_QuantlistAttribute] FOREIGN KEY ([QuantlistAttributeId]) REFERENCES [ql].[QuantlistAttribute]([QuantlistAttributeId]),
)
In my stored procedure, I pass in the QuantlistId I want to clone as @QuantlistId. Since the QuantlistAttribute table has a foreign key, I can easily clone that as well.
INSERT INTO [ql].[Quantlist] (
[StateId],
[Title]
) SELECT
1,
Title
FROM [ql].[Quantlist]
WHERE QuantlistId = @QuantlistId
SET @ClonedId = SCOPE_IDENTITY()
INSERT INTO ql.QuantlistAttribute(
QuantlistId
,Narrative)
SELECT
@ClonedId,
Narrative
FROM ql.QuantlistAttribute
WHERE QuantlistId = @QuantlistId
The trouble comes down to the AttributeReference. If I cloned 30 QuantlistAttribute records, how do I clone the records in the reference table and match them up with the new records I just inserted in to the QuantlistAttribute table?
INSERT INTO ql.AttributeReference(
QuantlistAttributeId,
Reference,)
SELECT
QuantlistAttributeId,
Reference,
FROM ql.QuantlistReference
WHERE ??? I don't have a key to go off of for this.
I thought I could do this with some temporary linking tables that holds the old attribute id's along with the new attribute id's. I don't know how to go about inserting the old Attribute Id's in to a temp table along with their new ones. Inserting the existing Attributes, by QuantlistId, is easy enough, but I can't figure out how to make sure I link the correct new and old Id's together in some way, so that the AttributeReference table can be cloned right. If I could get the QuantlistAttribute new and old Id's linked, I could join on that temp table and figure out how to restore the relationship of the newly cloned references, to the newly cloned attributes.
Any help on this would be awesome. I've spent the last day and a half trying to figure this out with no luck :/
Please excuse some of the SQL inconsistencies. I re-wrote up the sql real quick, trimming out a lot of additional columns, related-tables and constraints that weren't needed for this question.
Edit
After doing a little digging around, I found that OUTPUT might be useful for this. Is there a way to use OUTPUT to map the QuantlistAttributeId records I just inserted, to the QuantlistAttributeId they originated from?
You can use OUTPUT to get the inserted rows.
1. Insert the data into QuantlistAttribute in the order given by ORDER BY c.QuantlistAttributeId ASC.
2. Have a temp table/table variable with 3 columns: an id identity column, the new QuantlistAttributeId, and the old QuantlistAttributeId.
3. Use OUTPUT to insert the new identity values of QuantlistAttribute into the temp table/table variable. The new IDs are generated in the same order as c.QuantlistAttributeId.
4. Use a row_number() ordered by QuantlistAttributeId to match the old and new QuantlistAttributeIds, based on row_number() and the id of the table variable, and update the old QuantlistAttributeId values in the table variable.
5. Join the temp table with AttributeReference and insert the records in one go.
Note:
The ORDER BY during INSERT INTO SELECT and the ROW_NUMBER() used to get the matching old QuantlistAttributeId are required because, looking at your question, there seems to be no other logical key to map old and new records together.
Query for the above steps:
DECLARE @ClonedId INT, @QuantlistId INT = 0
INSERT INTO [ql].[Quantlist] (
[StateId],
[Title]
) SELECT
1,
Title
FROM [ql].[Quantlist]
WHERE QuantlistId = @QuantlistId
SET @ClonedId = SCOPE_IDENTITY()
--Define a table variable to store the new QuantlistAttributeID and use it to map with the Old QuantlistAttributeID
DECLARE @temp TABLE(id int identity(1,1), newAttrID INT, oldAttrID INT)
INSERT INTO ql.QuantlistAttribute(
QuantlistId
,Narrative)
--New QuantlistAttributeId are created in the same order as old QuantlistAttributeId because of ORDER BY
OUTPUT inserted.QuantlistAttributeId, NULL INTO @temp
SELECT
@ClonedId,
Narrative
FROM ql.QuantlistAttribute c
WHERE QuantlistId = @QuantlistId
--This is required to keep new ids generated in the same order as old
ORDER BY c.QuantlistAttributeId ASC
;WITH CTE AS
(
SELECT c.QuantlistAttributeId,
--Use ROW_NUMBER to get matching id which is same as the one generated in @temp
ROW_NUMBER() OVER (ORDER BY c.QuantlistAttributeId ASC) id
FROM ql.QuantlistAttribute c
WHERE QuantlistId = @QuantlistId
)
--Update the old value in @temp
UPDATE T
SET oldAttrID = CTE.QuantlistAttributeId
FROM @temp T
INNER JOIN CTE ON T.id = CTE.id
INSERT INTO ql.AttributeReference(
QuantlistAttributeId,
Reference)
SELECT
T.NewAttrID,
Reference
FROM ql.AttributeReference R
--Use OldAttrID to join with ql.AttributeReference and insert NewAttrID
INNER JOIN @temp T
ON T.oldAttrID = R.QuantlistAttributeId
Hope this helps.
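The core of the approach is an old-id to new-id map for the child rows. As a language-neutral illustration, here is a sketch that builds that map row by row, assuming Python's sqlite3 module instead of SQL Server; the simplified table and column names (quantlist, quantlist_attribute, attribute_reference, ref_text) and the helper names are made up for the example. It records each new id as the row is inserted, so it does not rely on matching insertion order.

```python
import sqlite3


def build_demo():
    """Create a three-level hierarchy with one published document."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE quantlist (
            id INTEGER PRIMARY KEY, state_id INTEGER NOT NULL, title TEXT NOT NULL);
        CREATE TABLE quantlist_attribute (
            id INTEGER PRIMARY KEY,
            quantlist_id INTEGER NOT NULL REFERENCES quantlist(id),
            narrative TEXT NOT NULL);
        CREATE TABLE attribute_reference (
            id INTEGER PRIMARY KEY,
            quantlist_attribute_id INTEGER NOT NULL REFERENCES quantlist_attribute(id),
            ref_text TEXT NOT NULL);
        INSERT INTO quantlist (id, state_id, title) VALUES (1, 2, 'Published doc');
        INSERT INTO quantlist_attribute (id, quantlist_id, narrative)
            VALUES (10, 1, 'first'), (11, 1, 'second');
        INSERT INTO attribute_reference (quantlist_attribute_id, ref_text)
            VALUES (10, 'ref-a'), (10, 'ref-b'), (11, 'ref-c');
        """
    )
    return conn


def clone_quantlist(conn, quantlist_id):
    """Copy a parent, its children and grandchildren; return the new parent id."""
    with conn:  # all three levels are copied in one transaction
        (title,) = conn.execute(
            "SELECT title FROM quantlist WHERE id = ?", (quantlist_id,)
        ).fetchone()
        new_parent = conn.execute(
            "INSERT INTO quantlist (state_id, title) VALUES (1, ?)", (title,)
        ).lastrowid
        id_map = {}  # old attribute id -> new attribute id
        children = conn.execute(
            "SELECT id, narrative FROM quantlist_attribute WHERE quantlist_id = ?",
            (quantlist_id,),
        ).fetchall()
        for old_id, narrative in children:
            id_map[old_id] = conn.execute(
                "INSERT INTO quantlist_attribute (quantlist_id, narrative) VALUES (?, ?)",
                (new_parent, narrative),
            ).lastrowid
        for old_id, new_id in id_map.items():
            conn.execute(
                "INSERT INTO attribute_reference (quantlist_attribute_id, ref_text) "
                "SELECT ?, ref_text FROM attribute_reference "
                "WHERE quantlist_attribute_id = ?",
                (new_id, old_id),
            )
    return new_parent


def reference_tree(conn, quantlist_id):
    """Return (narrative, ref_text) pairs for a document, for comparing clones."""
    return conn.execute(
        "SELECT a.narrative, r.ref_text FROM quantlist_attribute a "
        "JOIN attribute_reference r ON r.quantlist_attribute_id = a.id "
        "WHERE a.quantlist_id = ? ORDER BY a.narrative, r.ref_text",
        (quantlist_id,),
    ).fetchall()
```

A row-by-row loop costs more round trips than the set-based statement above, so it is best read as an explanation of the mapping rather than a replacement for it.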

SQL Server: Extracting a Column Into a Table

I have a table with a column that I want to extract out and put into a separate table.
For example, let's say I have a table named Contacts. Contacts has a column named Name which stores a string. Now I want to pull the names out into another table named Name and link the Contacts.Name column to the Id of the Name table.
I can only use SQL to do this. Any ideas on the best way to go about this?
Let me know if I can clarify anything, thanks!
[edit]
One problem is that different contacts can be tied to the same name. So when different contacts have the same name and it gets exported the Name table would only have one unique row for that name and all the contacts would point to that row. I guess this wouldn't make sense if I were actually working on a contact book, but I'm just using it to illustrate my problem.
CREATE TABLE Name (NameID int IDENTITY(1, 1), [Name] varchar(50))
INSERT INTO Name ([Name])
SELECT DISTINCT [Name]
FROM Contact
ALTER TABLE Contact
ADD NameID int
UPDATE Contact
SET NameID = [Name].NameID
FROM Contact
INNER JOIN [Name]
ON Contact.[Name] = [Name].[Name]
ALTER TABLE Contact
DROP COLUMN [Name]
Then add foreign key constraint, etc.
Create the new table with a Foreign key that points back to the contact table. Then insert the names and contactids from the contact table into this new table. After that you can drop the "name" column from the contact table.
CREATE TABLE Name
(
ContactId int,
Name nvarchar(100)
);
INSERT Name(ContactId, Name)
SELECT ContactId, Name From Contact;
ALTER TABLE Contact
DROP Column name;
EDIT: Since you have edited the question to mention that one name can be associated with multiple contacts, the relationship goes the other way: the Contact table should reference the Name table.
CREATE TABLE Name
(
NameId int IDENTITY,
Name nvarchar(100)
);
INSERT Name(Name)
SELECT DISTINCT Name From Contact;
ALTER TABLE Contact
ADD NameId int;
UPDATE c
SET c.NameId = n.NameId
FROM Contact c
JOIN Name n on n.Name = c.Name;
ALTER Table Contact
Drop Column Name;
NOTE: Make sure that you create the appropriate foreign key between the Contact and Name tables using the NameId on the Contact table, and also create a UNIQUE constraint on the "name" column in the Name table.
insert into another_table( contact_id, name )
select id, name
from contacts;
insert into new_table (contact_id, name)
select min(id), name
from contacts
group by name;
This is one way of ensuring only one row per name. You can substitute other aggregate functions for min (for example, max).
I'm not too sure why you would want to do this, though. No matter what, you will end up with some contacts that don't have a name linked to them...
ALTER TABLE `Contacts` ADD `name_id` INT( 12 ) NOT NULL
ALTER TABLE `Name` ADD `Name` VARCHAR( 200 ) NOT NULL
INSERT INTO Name (id, name) SELECT id, Name FROM Contacts
ALTER TABLE `Contacts` DROP `Name`
The problem is the name_id field, which is filled with "0" and should have the same value as the id in the Contacts table. Here you can use the LOOP or ITERATE statement (if you are using MySQL).