ORACLE SQL - Creating a disjoint specialisation - sql

The following is what I am expecting from my tables. An item can be disjointed into a 'Computer' or 'Clothes'.
Computer has attributes 'computer_type', & 'operating_system'.
Clothes has only 'clothes_type'. As I want these as sub types they'll both take the 'item_id' as a foreign key.
My plan is to have these two sub tables derived from the parent item, and insert (the imported data that I am working on) into the appropriate tables. The 'item' table stores all the 'items' but the sub tables will reference a given item ID and be populated.
Table: Item (item_id,)
table: computer (sub type) with its own attributes
table: clothes (sub type) with its own attributes.
I saw another post here:
How to write Tables in SQL with a Disjoint Connection
So I wondered (as my approach is similar to his) whether I should create the tables as they are, with a foreign key in both of my disjoint tables, and use a 'select' accordingly. Or is there a way to store the individual attributes without having separate tables?

"Or is there a way to store the individual attributes without having separate tables?"
There is no good way to avoid having separate tables. Bad ways include using a so-called generic data model (the entity-attribute-value pattern) or some form of quasi-structured storage such as JSON or XML.
These approaches are bad for several reasons.
It is very hard to understand the data model. The model is embedded in how the data is stored rather than being expressed in tables and keys.
We cannot enforce keys and data integrity constraints on such models.
It is harder to write queries against such "models" because SQL is a strongly-typed language suited to static data models. The time saved by not doing proper data modelling is lost in writing the convoluted queries needed to handle the poor-quality data (because of the previous point).
A robust solution would enforce the arc across the subtype tables with the full range of keys.
create table items
(item_id number not null primary key
, item_type varchar2(10) not null
, item_name varchar2(30) not null
, constraint item_uk unique (item_id, item_type)
, constraint item_type_ck check (item_type in ('CLOTHES', 'COMPUTER'))
);
Two keys? Why two keys? So we can enforce the one-to-one relationship between parent and child records:
create table clothes
(item_id number not null primary key
, item_type varchar2(10) not null
, clothes_size varchar2(5) not null -- renamed: SIZE is a reserved word in Oracle
, colour varchar2(5) not null
, category varchar2(5) not null
, constraint clothes_type_ck check (item_type = 'CLOTHES')
, constraint clothes_item_fk foreign key (item_id, item_type)
references items (item_id, item_type)
);
create table computers
(item_id number not null primary key
, item_type varchar2(10) not null
, ram_gb number not null
, storage_gb number not null
, os varchar2(10) not null
, constraint computers_type_ck check (item_type = 'COMPUTER')
, constraint computers_item_fk foreign key (item_id, item_type)
references items (item_id, item_type)
);
The primary key on ITEMS ensures we have only one record for a given item_id. The unique key on (item_id, item_type) gives the child tables something to reference; together with the check constraint on each child's item_type, it means we cannot have a COMPUTERS record and a CLOTHES record pointing to the same ITEMS record.
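To see the arc being enforced, here is a minimal sketch that replays the same design in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Oracle here, the column types are simplified, and the helper name try_insert is made up for the illustration.

```python
import sqlite3

# SQLite stands in for Oracle; types are simplified for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
create table items
( item_id   integer not null primary key
, item_type text    not null check (item_type in ('CLOTHES', 'COMPUTER'))
, item_name text    not null
, unique (item_id, item_type)
);
create table clothes
( item_id      integer not null primary key
, item_type    text    not null check (item_type = 'CLOTHES')
, clothes_type text    not null
, foreign key (item_id, item_type) references items (item_id, item_type)
);
create table computers
( item_id   integer not null primary key
, item_type text    not null check (item_type = 'COMPUTER')
, os        text    not null
, foreign key (item_id, item_type) references items (item_id, item_type)
);
insert into items values (1, 'COMPUTER', 'iMac');
insert into computers values (1, 'COMPUTER', 'Mac OS');
""")


def try_insert(sql):
    """Run one INSERT; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Item 1 is a COMPUTER, so no CLOTHES row can attach to it:
# the first attempt breaks the composite foreign key, the second the check constraint.
wrong_parent_type = try_insert("insert into clothes values (1, 'CLOTHES', 'coat')")
wrong_child_type = try_insert("insert into clothes values (1, 'COMPUTER', 'coat')")
print(wrong_parent_type, wrong_child_type)  # False False
```

Both attempts are rejected, which is the disjointness the two keys are there to guarantee.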

You can try and do this with 3 tables, a CHECK constraint (for the "supertype" table), and foreign key constraints. However, a good alternative may be the object-relational approach: create 3 TYPEs, and only one table that can hold all the data.
Example (Oracle 12c)
-- supertype
create or replace type item_t as object (
name varchar2(64)
)
not final; -- subtypes can be derived
/
-- subtypes
create or replace type computer_t under item_t (
operating_system varchar2(128)
)
/
create or replace type clothes_t under item_t (
description varchar2(128)
)
/
-- Table that handles all 3 types. No foreign key constraints or CHECKS needed.
create table items (
id number generated always as identity primary key
, item item_t
);
Testing: INSERTs
begin
-- "standard" INSERTs
insert into items ( item )
values ( computer_t( 'iMac', 'Mac OS' ) );
insert into items ( item )
values ( clothes_t( 'coat', 'black' ) );
-- object of SUPERtype can be inserted
insert into items ( item )
values ( item_t( 'supertype' ) );
end;
/
-- Unknown types cannot be inserted. Good.
insert into items ( item )
values ( unknown_t( 'type unknown!', 'not available', 999 ) );
-- ORA-00904: "UNKNOWN_T": invalid identifier
insert into items ( item )
values ( item_t( 'supertype', 'not available') );
-- ORA-02315: incorrect number of arguments for default constructor
For queries, use the appropriate functions. A simple SELECT * FROM items will probably not give you the results you need, e.g.
SQL> select * from items;
ID ITEM
1 oracle.sql.STRUCT@2d901eb0
2 oracle.sql.STRUCT@3ba987b8
3 oracle.sql.STRUCT@3f191845
Query
select
id
, treat( item as computer_t ).name as computer_name
, treat( item as clothes_t ).name as garment_name
, treat( item as item_t ).name as item_name
, case
when treat( item as item_t ) is not null
then 'ITEM_T'
when treat( item as computer_t ) is not null -- "is an" item
then 'COMPUTER_T'
when treat( item as clothes_t ) is not null -- "is an" item
then 'CLOTHES_T'
else
'TYPE unknown :-|'
end type_
, case
when treat( item as computer_t ) is not null
then 'COMPUTER_T'
when treat( item as clothes_t ) is not null
then 'CLOTHES_T'
when treat( item as item_t ) is not null
then 'ITEM_T'
else
'TYPE unknown :-|'
end type_
from items ;
-- result
ID  COMPUTER_NAME  GARMENT_NAME  ITEM_NAME  TYPE_   TYPE_
1   iMac           NULL          iMac       ITEM_T  COMPUTER_T
2   NULL           coat          coat       ITEM_T  CLOTHES_T
3   NULL           NULL          supertype  ITEM_T  ITEM_T

Directionless relationship failing in PostgreSQL

I am trying to create a 2-way relationship table in PostgreSQL for my 3 objects. This idea has stemmed from the following question https://dba.stackexchange.com/questions/48568/how-to-relate-two-rows-in-the-same-table where I also want to store the relationship and its reverse between rows.
For context on my database: object 1 contains (i.e. relates to many) object 2s, and these object 2s in turn relate to many object 3s. So there is a one-to-many relationship (object 1 to object 2) and a many-to-many relationship (object 2 to object 3).
Each of the objects has been assigned a UUID in other tables that hold information about it. Based on these UUIDs I want to be able to query the objects and get the associated objects' UUIDs as well. This will show me the associations and tell me which object I should look at for location, info, etc., just by knowing the UUID.
Please note that one box may have a relationship with 10 slots, so the one UUID assigned to the box will appear in my uuid1 column 10 times. This is a must!
My next step was to try and create a directionless relationship using this query:
CREATE TABLE bridge_x
(uuid1 UUID NOT NULL REFERENCES temp (uuid1), uuid2 UUID NOT NULL REFERENCES temp (uuid2),
PRIMARY KEY(uuid1, uuid2),
CONSTRAINT temp_temp_directionless
FOREIGN KEY (uuid2, uuid1)
REFERENCES bridge_x (uuid1, uuid2)
);
Is there any other way I can store ALL the information mentioned and be able to query the UUID in order to see the relationship between the objects?
You'll need a composite primary key in the bridge table. An example, using polygamous marriages:
CREATE TABLE person
(person_id INTEGER NOT NULL PRIMARY KEY
, name varchar NOT NULL
);
CREATE TABLE marriage
( person1 INTEGER NOT NULL
, person2 INTEGER NOT NULL
, comment varchar
, CONSTRAINT marriage_1 FOREIGN KEY (person1) REFERENCES person(person_id)
, CONSTRAINT marriage_2 FOREIGN KEY (person2) REFERENCES person(person_id)
, CONSTRAINT order_in_court CHECK (person1 < person2)
, CONSTRAINT polygamy_allowed UNIQUE (person1,person2)
);
INSERT INTO person(person_id,name) values (1,'Bob'),(2,'Alice'),(3,'Charles');
INSERT INTO marriage(person1,person2, comment) VALUES(1,2, 'Crypto marriage!') ; -- Ok
INSERT INTO marriage(person1,person2, comment) VALUES(2,1, 'Not twice!' ) ; -- Should fail
INSERT INTO marriage(person1,person2, comment) VALUES(3,3, 'No you dont...' ) ; -- Should fail
INSERT INTO marriage(person1,person2, comment) VALUES(2,3, 'OMG she did it again.' ) ; -- Should fail (does not)
INSERT INTO marriage(person1,person2, comment) VALUES(3,4, 'Non existant persons are not allowed to marry !' ) ; -- Should fail
SELECT p1.name, p2.name, m.comment
FROM marriage m
JOIN person p1 ON m.person1 = p1.person_id
JOIN person p2 ON m.person2 = p2.person_id
;
Result:
CREATE TABLE
CREATE TABLE
INSERT 0 3
INSERT 0 1
ERROR: new row for relation "marriage" violates check constraint "order_in_court"
DETAIL: Failing row contains (2, 1, Not twice!).
ERROR: new row for relation "marriage" violates check constraint "order_in_court"
DETAIL: Failing row contains (3, 3, No you dont...).
INSERT 0 1
ERROR: insert or update on table "marriage" violates foreign key constraint "marriage_2"
DETAIL: Key (person2)=(4) is not present in table "person".
name | name | comment
-------+---------+-----------------------
Bob | Alice | Crypto marriage!
Alice | Charles | OMG she did it again.
(2 rows)
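Because the CHECK forces person1 < person2, the application has to order each pair before inserting it and has to look in both columns when reading. A minimal sketch of that, using Python's sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL (the helper names marry and partners are made up for the illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person
( person_id INTEGER NOT NULL PRIMARY KEY
, name TEXT NOT NULL
);
CREATE TABLE marriage
( person1 INTEGER NOT NULL REFERENCES person(person_id)
, person2 INTEGER NOT NULL REFERENCES person(person_id)
, CHECK (person1 < person2)
, UNIQUE (person1, person2)
);
INSERT INTO person VALUES (1, 'Bob'), (2, 'Alice'), (3, 'Charles');
""")


def marry(a, b):
    """Store the pair smallest id first, so (a, b) and (b, a) map to the same row."""
    lo, hi = sorted((a, b))
    with conn:
        conn.execute("INSERT INTO marriage VALUES (?, ?)", (lo, hi))


def partners(person_id):
    """Return partner names, whichever column the person was stored in."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM marriage m
        JOIN person p
          ON p.person_id = CASE WHEN m.person1 = :id THEN m.person2 ELSE m.person1 END
        WHERE :id IN (m.person1, m.person2)
        ORDER BY p.name
        """,
        {"id": person_id},
    )
    return [name for (name,) in rows]


marry(2, 1)  # stored as (1, 2)
marry(3, 2)  # stored as (2, 3)
print(partners(2))  # ['Bob', 'Charles']
print(partners(3))  # ['Alice']
```

The same id can appear many times in either column, so one box related to ten slots is simply ten rows.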

SQL set constraint on how many times a PK can be referenced

I'm building a demo database of a zoo for my school project and I've encountered the following problem: I have a table Pavilion, which has a primary key id_pavilion and a column capacity (the highest number of animals that can live in this pavilion).
Let's say that each pavilion can contain 2 animals at maximum.
Pavilion
id_pavilion  capacity
-----------------------
1            2
2            2
3            2
4            2
Animal
id_an-column2-column3  id_pavilion
---------------------------------------
1                      2
2                      2
3                      2
4                      2
(This shows what I'm trying to prevent)
Then I have table animal, which contains some information about the animal and mainly the id_pavilion from Pavilion as a foreign key.
My question is: how can I add such a constraint that the PK id_pavilion from Pavilion can be referenced in table Animal only so many times as the capacity allows?
Looking at your example data, one could argue that every PAVILION can accommodate 2 animals, right? One could also say that the "accommodations" need to be in place before the animals can be kept in an appropriate manner. Thus, we could create a table called ACCOMMODATION, listing all available spaces.
create table pavilion( id primary key, capacity )
as
select level, 2 from dual connect by level <= 4 ;
create table accommodation(
id number generated always as identity start with 1000 primary key
, pavilionid number references pavilion( id )
) ;
Generate all accommodations
-- No "human intervention" here.
-- Only the available spaces will be INSERTed.
insert into accommodation ( pavilionid )
select id
from pavilion P1, lateral (
select 1
from dual
connect by level <= ( select capacity from pavilion where id = P1.id )
) ;
-- we can accommodate 8 animals ...
select count(*) from accommodation ;
COUNT(*)
----------
8
-- accommodations and pavilions
SQL> select * from accommodation ;
ID PAVILIONID
---------- ----------
1000 1
1001 1
1002 2
1003 2
1004 3
1005 3
1006 4
1007 4
8 rows selected.
Each animal should be in a single (defined) location. When an animal is "added" to the zoo, it can only (physically) be in a single location/accommodation. We can use a UNIQUE key and a FOREIGN key (referencing ACCOMMODATION) to enforce this.
-- the ANIMAL table will have more columns eg GENUS, SPECIES, NAME etc
create table animal(
id number generated always as identity start with 2000
-- , name varchar2( 64 )
, accommodation number
) ;
alter table animal
add (
constraint animal_pk primary key( id )
, constraint accommodation_unique unique( accommodation )
, constraint accommodation_fk
foreign key( accommodation ) references accommodation( id )
);
Testing
-- INSERTs will also affect the columns GENUS, SPECIES, NAME etc
-- when the final version of the ANIMAL table is in place.
SQL> insert into animal( accommodation ) values ( 1000 ) ;
1 row inserted.
SQL> insert into animal( accommodation ) values ( 1001 ) ;
1 row inserted.
-- trying to INSERT into the same location again
-- MUST fail (due to the unique constraint)
SQL> insert into animal( accommodation ) values ( 1000 );
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 1000 )
Error report -
ORA-00001: unique constraint (...ACCOMMODATION_UNIQUE) violated
SQL> insert into animal( accommodation ) values ( 1001 );
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 1001 )
Error report -
ORA-00001: unique constraint (...ACCOMMODATION_UNIQUE) violated
-- trying to INSERT into a location that does not exist
-- MUST fail (due to the foreign key constraint)
SQL> insert into animal( accommodation ) values ( 9999 ) ;
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 9999 )
Error report -
ORA-02291: integrity constraint (...ACCOMMODATION_FK) violated - parent key not found
Animals and accommodations
select
A.id as animal
, P.id as pavilion
, AC.id as location --(accommodation)
from pavilion P
join accommodation AC on P.id = AC.pavilionid
join animal A on AC.id = A.accommodation
;
ANIMAL PAVILION LOCATION
---------- ---------- ----------
2000 1 1000
2001 1 1001
Tested with Oracle 12c and 18c. (You'll need version 12c or later for the LATERAL join to work.)
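For readers without an Oracle instance, here is a rough sketch of the same "one row per available space" idea in an in-memory SQLite database driven from Python. It is an adaptation, not the script above: a recursive CTE replaces CONNECT BY and LATERAL, plain integer keys replace the identity columns, only two pavilions are created, and the helper name house is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
create table pavilion (id integer primary key, capacity integer not null);
create table accommodation
( id integer primary key
, pavilion_id integer not null references pavilion (id)
);
create table animal
( id integer primary key
, accommodation integer not null unique references accommodation (id)
);
insert into pavilion values (1, 2), (2, 2);

-- one accommodation row per unit of capacity
with recursive slot (pavilion_id, n) as (
    select id, 1 from pavilion where capacity >= 1
    union all
    select s.pavilion_id, s.n + 1
    from slot s join pavilion p on p.id = s.pavilion_id
    where s.n < p.capacity
)
insert into accommodation (pavilion_id)
select pavilion_id from slot order by pavilion_id, n;
""")


def house(slot_id):
    """Try to put a new animal into the given accommodation."""
    try:
        with conn:
            conn.execute("insert into animal (accommodation) values (?)", (slot_id,))
        return True
    except sqlite3.IntegrityError:
        return False


slots = conn.execute("select count(*) from accommodation").fetchone()[0]
# two free slots, then an occupied slot, then a slot that does not exist
results = [house(1), house(2), house(1), house(99)]
print(slots, results)  # 4 [True, True, False, False]
```

The unique constraint rejects the occupied slot and the foreign key rejects the non-existent one, so a pavilion can never hold more animals than it has accommodation rows.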
What you are trying to enforce at the database level is more of a 'business logic' rule than a hard data constraint. You cannot implement it directly in your table designs; even if you could (as @serg mentions in the comments), it would require a very expensive (in terms of CPU/resources) lock on the table to perform the counting.
Another option, that would achieve your goal and keep the business logic separate from the data design, is to use a SQL Trigger.
A trigger can run before the data is inserted into your table; here you can check how many rows have already been inserted for that 'pavilion entity' and abort or allow the insert.
A comment for the "school project" side of things:
This being said, the sort of logic you are talking about is much better served within your consuming application rather than the database (my opinion, others may disagree). Also perhaps think about defining the size limit in the data, so you can have different sized pavilions.
Notes:
For anyone visiting this question in the future, the above link is for an Oracle trigger (as the OP has tagged the question for Oracle). This link is for Microsoft SQL Server triggers.
The answer is "not easily". Although the idea of keeping the "accommodations" in the pavilions as a separate table is a clever one, animals are put into pavilions, not accommodations. Modeling accommodations makes it much trickier to move animals around.
Perhaps the simplest approach is to use triggers. This starts with an animal_count column in pavilions. This column starts at zero and is incremented or decremented as animals move in or out. You can use a check constraint to validate that the pavilion is not over-capacity.
Unfortunately, maintaining this column requires triggers on the animals table: one each for insert, update, and delete.
In the end, the trigger is maintaining the count and if you attempt to put an animal in a full pavilion, you will violate the check constraint.
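A minimal sketch of this counter-plus-check approach, again using an in-memory SQLite database from Python rather than Oracle. The trigger syntax is SQLite's, and the table, trigger and helper names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
create table pavilion
( id integer primary key
, capacity integer not null
, animal_count integer not null default 0
, check (animal_count between 0 and capacity)
);
create table animal
( id integer primary key
, pavilion_id integer not null references pavilion (id)
);
create trigger animal_ins after insert on animal
begin
    update pavilion set animal_count = animal_count + 1 where id = new.pavilion_id;
end;
create trigger animal_del after delete on animal
begin
    update pavilion set animal_count = animal_count - 1 where id = old.pavilion_id;
end;
create trigger animal_move after update of pavilion_id on animal
begin
    update pavilion set animal_count = animal_count - 1 where id = old.pavilion_id;
    update pavilion set animal_count = animal_count + 1 where id = new.pavilion_id;
end;
insert into pavilion (id, capacity) values (1, 2), (2, 2);
""")


def run(sql, params=()):
    """Execute one statement; False means a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Three animals try to move into pavilion 1, which holds two.
added = [run("insert into animal (pavilion_id) values (1)") for _ in range(3)]
moved = run("update animal set pavilion_id = 2 where id = 1")
counts = conn.execute("select id, animal_count from pavilion order by id").fetchall()
print(added, moved, counts)  # [True, True, False] True [(1, 1), (2, 1)]
```

The third insert fails inside the trigger, and the failed check rolls back the whole statement, so no animal row is left behind. Moving an animal only needs an update of its pavilion_id.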
You need a column (say "NrOccupants") that is updated when an animal is placed into or removed from each pavilion. Then you add a check constraint to that column that prevents the application code from adding more animals to a pavilion than is permitted by the rule that is enforced by the check constraint.
Here is an example of the SQL DDL that would do that.
CREATE SCHEMA Pavilion
GO
CREATE TABLE Pavilion.Pavilion
(
pavilionNr int NOT NULL,
capacity tinyint CHECK (capacity IN (2)) NOT NULL,
nrOccupants tinyint CHECK (nrOccupants BETWEEN 0 AND 2) NOT NULL,
CONSTRAINT Pavilion_PK PRIMARY KEY(pavilionNr)
)
GO
CREATE TABLE Pavilion.Animal
(
animalNr int NOT NULL,
name nchar(50) NOT NULL,
pavilionNr int NOT NULL,
type nchar(50) NOT NULL,
weight smallint NOT NULL,
CONSTRAINT Animal_PK PRIMARY KEY(animalNr)
)
GO
ALTER TABLE Pavilion.Animal ADD CONSTRAINT Animal_FK FOREIGN KEY (pavilionNr) REFERENCES Pavilion.Pavilion (pavilionNr) ON DELETE NO ACTION ON UPDATE NO ACTION
GO

Can I use a PL/SQL trigger to check category and concat to a incremental number based on category?

I got a productID - P0001KTC and P0001DR.
If product category is kitchen, I will assign a productID - PROD001KTC, else if the category is dining room, then the productID should be PROD001DR.
Is it possible to write a sequence inside a trigger to check the product category and assign an id as mentioned above?
If another living room category product is inserted, its id will be PROD001LR.
Kitchen - PROD001KTC,PROD002KTC...
Dining Room - PROD001DR,PROD002DR....
Living Room - PROD001LR,PROD002LR...
P0001KTC is the sort of smart key users love and developers hate. But the customer is king, so here we are.
The customer's requirement is to increment the numeric element within the product category, so that the same number is used for different categories: P0001KTC , P0001DR , P0002KTC, P0001LR, P0002LR, etc. A monotonically increasing sequence cannot do this.
The best implementation is a code control table, that is a table to manage the assigned numbers. Such an approach entails pessimistic locking, which serializes access to a Product Category (e.g. KTC). Presumably the users won't be creating new Products very often, so the scaling implications aren't severe.
Working PoC
Here's our reference table:
create table product_categories (
product_category_code varchar2(3) not null
, category_description varchar2(30) not null
, constraint product_categories_pk primary key (product_category_code)
)
/
create table product_ids (
product_category_code varchar2(3) not null
, last_number number(38) default 0 not null
, constraint product_ids_pk primary key (product_category_code)
, constraint product_ids_categories_fk foreign key (product_category_code)
references product_categories (product_category_code)
) organization index
/
Maybe these two tables could be one table, but this implementation offers greater flexibility. Let's create our Product Categories:
insert all
into product_categories (product_category_code, category_description)
values (cd, descr)
into product_ids (product_category_code)
values (cd)
select * from
( select 'KTC' as cd, 'Kitchen' as descr from dual union all
select 'LR' as cd, 'Living Room' as descr from dual union all
select 'DR' as cd, 'Dining Room' as descr from dual )
/
Here's the target table:
create table products (
product_id varchar2(10) not null
, product_category_code varchar2(3) not null
, product_description varchar2(30) not null
, constraint products_pk primary key (product_id)
, constraint products_fk foreign key (product_category_code)
references product_categories (product_category_code)
)
/
This function is where the magic happens. The function formats the new Product ID. It does this by taking out a pre-emptive lock on the row for the assigned Category. These locks are retained for the length of the transaction i.e. until the locking session commits or rolls back. So if there are two users creating Kitchen Products one will be left hanging on the other: this is why we generally try to avoid serializing table access in multi-user environments.
create or replace function get_product_id
( p_category_code in product_categories.product_category_code%type)
return products.product_id%type
is
cursor lcur (p_code varchar2)is
select last_number + 1
from product_ids
where product_category_code = p_code
for update of last_number;
next_number product_ids.last_number%type;
return_value products.product_id%type;
begin
open lcur( p_category_code);
fetch lcur into next_number;
if next_number > 999 then
raise_application_error (-20000
, 'No more numbers available for ' || p_category_code);
else
return_value := 'PROD' || lpad(next_number, 3, '0') || p_category_code;
end if;
update product_ids t
set t.last_number = next_number
where current of lcur;
close lcur;
return return_value;
end get_product_id;
/
And here's the trigger:
create or replace trigger products_ins_trg
before insert on products
for each row
begin
:new.product_id := get_product_id (:new.product_category_code);
end;
/
Obviously, we could put the function code in the trigger body but it's good practice to keep business logic out of triggers.
Lastly, here's some test data...
insert into products ( product_category_code, product_description)
values ('KTC', 'Refrigerator')
/
insert into products ( product_category_code, product_description)
values ('DR', 'Dining table')
/
insert into products ( product_category_code, product_description)
values ('KTC', 'Microwave oven')
/
insert into products ( product_category_code, product_description)
values ('DR', 'Dining chair')
/
insert into products ( product_category_code, product_description)
values ('DR', 'Hostess trolley')
/
insert into products ( product_category_code, product_description)
values ('LR', 'Sofa')
/
And, lo!
SQL> select * from products
2 /
PRODUCT_ID PRO PRODUCT_DESCRIPTION
---------- --- ------------------------------
PROD001KTC KTC Refrigerator
PROD001DR DR Dining table
PROD002KTC KTC Microwave oven
PROD002DR DR Dining chair
PROD003DR DR Hostess trolley
PROD001LR LR Sofa
6 rows selected.
SQL>
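The same code-control-table idea can be tried without Oracle. The sketch below is an approximation in Python with an in-memory SQLite database: SQLite has no row-level FOR UPDATE, so BEGIN IMMEDIATE (a database-wide write lock) plays the role of the pessimistic lock, and the allocation is done in an application function (add_product, a made-up name) instead of a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # transactions managed by hand
conn.executescript("""
create table product_ids
( product_category_code text primary key
, last_number integer not null default 0
);
create table products
( product_id text primary key
, product_category_code text not null references product_ids (product_category_code)
, product_description text not null
);
insert into product_ids (product_category_code) values ('KTC'), ('LR'), ('DR');
""")


def add_product(category_code, description):
    """Allocate the next number for the category and insert the product atomically."""
    conn.execute("begin immediate")  # take the write lock before touching the counter
    try:
        cur = conn.execute(
            "update product_ids set last_number = last_number + 1 "
            "where product_category_code = ?",
            (category_code,),
        )
        if cur.rowcount != 1:
            raise ValueError(f"unknown category {category_code!r}")
        (number,) = conn.execute(
            "select last_number from product_ids where product_category_code = ?",
            (category_code,),
        ).fetchone()
        if number > 999:
            raise ValueError(f"no more numbers available for {category_code}")
        product_id = f"PROD{number:03d}{category_code}"
        conn.execute(
            "insert into products values (?, ?, ?)",
            (product_id, category_code, description),
        )
        conn.execute("commit")
        return product_id
    except Exception:
        conn.execute("rollback")  # the counter increment is undone as well
        raise


ids = [add_product(code, descr) for code, descr in [
    ("KTC", "Refrigerator"), ("DR", "Dining table"),
    ("KTC", "Microwave oven"), ("LR", "Sofa"),
]]
print(ids)  # ['PROD001KTC', 'PROD001DR', 'PROD002KTC', 'PROD001LR']
```

Because the increment and the insert share one transaction, a failed insert does not burn a number, which is the main difference from a sequence.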
Note that modelling the smart key as a single column is a bad idea. It is better to build it as a composite key, say unique (product_category, product_number), where product_number is generated from the code control table above. We still need the product_id for display purposes, but it should be derived from the underlying columns. This is easy using virtual columns, like this:
create table products (
product_id varchar2(10)
generated always as ('PROD' || to_char(product_no,'FM000') || product_category_code) virtual
, product_category_code varchar2(3) not null
, product_no number not null
, product_description varchar2(30) not null
, constraint products_pk primary key (product_id)
, constraint products_uk unique (product_category_code, product_no)
, constraint products_fk foreign key (product_category_code)
references product_categories (product_category_code)
)
/
The image shows a different format for productid than what you write in the question. I will assume you want the "PROD" prefix as in the image, and that you can deal with changing those characters in the solution below, if needed.
Also, you write twice the same number (001) in the question, yet in the image, and in the comments you provided, you indicate the numbering should increment always. So this solution will have an always incrementing number.
Proposed Solution
You should store the incremental number separately, and have that as the real id.
The formatted productID could then be a derived column. Since Oracle 11g R1 you can create virtual columns in a table, so you don't really need a trigger for that:
Here is an example script, which creates the table and the sequence:
create table products (
id number not null,
category varchar2(100),
productid as (
'PROD'||
to_char(id, 'FM000') ||
case category when 'Kitchen' then 'KTC'
when 'LivingRoom' then 'LR'
else '???'
end ) virtual,
constraint pk_product_id primary key (id)
);
-- create sequence for inserting incremental id value
create sequence product_seq start with 1 increment by 1;
You insert data like this, without specifying values for the virtual productid column:
-- Insert data
insert into products (id, category) values (product_seq.nextval, 'Kitchen');
insert into products (id, category) values (product_seq.nextval, 'LivingRoom');
And when you select data from the table:
select * from products
You get:
ID | CATEGORY | PRODUCTID
---+------------+-----------
1 | Kitchen | PROD001KTC
2 | LivingRoom | PROD002LR
Note that you'll get into trouble if your id surpasses 999, as then the 3-digit format will not work any more. Oracle will then generate ### for the to_char result, so you'll run into duplicate productid values soon.
If you have many more categories than those two (Kitchen & LivingRoom), then you should not extend the earlier mentioned case statement with those values. Instead you should create a reference table for it (let's call it categories), with values like this:
Code | Name
-----+---------------
KTC | Kitchen
LR | Living Room
... | ...
Once you have that table, where Code should be unique, you can just store the code in the products table, not the description:
create table products (
id number not null,
category_code varchar2(10),
productid as (
'PROD'||
to_char(id, 'FM000') ||
category_code) virtual,
constraint pk_product_id primary key (id),
constraint fk_product_category foreign key (category_code)
references categories(code)
);
You would insert values like this:
insert into products (id, category_code) values (product_seq.nextval, 'KTC');
insert into products (id, category_code) values (product_seq.nextval, 'LR');
And when you want to select data from the table with the category names included:
select products.productid, categories.name
from products
inner join categories on products.category_code = categories.code

Moving table columns to new table and referencing as foreign key in PostgreSQL

Suppose we have a DB table with fields
"id", "category", "subcategory", "brand", "name", "description", etc.
What's a good way of creating separate tables for
category, subcategory and brand
and the corresponding columns and rows in the original table becoming foreign key references?
To outline the operations involved:
get all unique values in each column of the original table which should become foreign keys;
create tables for those
create foreign key reference columns in the original table (or a copy)
In this case, the PostgreSQL DB is accessed via Sequel in a Ruby app, so available interfaces are the command line, Sequel, PGAdmin, etc...
The question: how would you do this?
-- Some test data
CREATE TABLE animals
( id SERIAL NOT NULL PRIMARY KEY
, name varchar
, category varchar
, subcategory varchar
);
INSERT INTO animals(name, category, subcategory) VALUES
( 'Chimpanzee' , 'mammals', 'apes' )
,( 'Urang Utang' , 'mammals', 'apes' )
,( 'Homo Sapiens' , 'mammals', 'apes' )
,( 'Mouse' , 'mammals', 'rodents' )
,( 'Rat' , 'mammals', 'rodents' )
;
-- [empty] table to contain the "squeezed out" domain
CREATE TABLE categories
( id SERIAL NOT NULL PRIMARY KEY
, category varchar
, subcategory varchar
, UNIQUE (category,subcategory)
);
-- The original table needs a "link" to the new table
ALTER TABLE animals
ADD column category_id INTEGER -- NOT NULL
REFERENCES categories(id)
;
-- FK constraints are helped a lot by a supportive index.
CREATE INDEX animals_categories_fk ON animals (category_id);
-- Chained query to:
-- * populate the domain table
-- * initialize the FK column in the original table
WITH ins AS (
INSERT INTO categories(category, subcategory)
SELECT DISTINCT a.category, a.subcategory
FROM animals a
RETURNING *
)
UPDATE animals ani
SET category_id = ins.id
FROM ins
WHERE ins.category = ani.category
AND ins.subcategory = ani.subcategory
;
-- Now that we have the FK pointing to the new table,
-- we can drop the redundant columns.
ALTER TABLE animals DROP COLUMN category, DROP COLUMN subcategory;
-- show it to the world
SELECT a.*
, c.category, c.subcategory
FROM animals a
JOIN categories c ON c.id = a.category_id
;
Note: the fragment:
WHERE ins.category = ani.category
AND ins.subcategory = ani.subcategory
will lead to problems if these columns contain NULLs.
It would be better to compare them using
(ins.category,ins.subcategory)
IS NOT DISTINCT FROM
(ani.category,ani.subcategory)
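The effect of the NULL-safe comparison is easy to check on a small case. The following sketch uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL: SQLite's IS operator plays the role of IS NOT DISTINCT FROM, a correlated subquery replaces UPDATE ... FROM, and the 'Platypus' row with a NULL subcategory is test data invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals
( id INTEGER PRIMARY KEY
, name TEXT
, category TEXT
, subcategory TEXT
, category_id INTEGER
);
CREATE TABLE categories
( id INTEGER PRIMARY KEY
, category TEXT
, subcategory TEXT
);
INSERT INTO animals (name, category, subcategory) VALUES
  ('Chimpanzee', 'mammals', 'apes')
, ('Mouse', 'mammals', 'rodents')
, ('Platypus', 'mammals', NULL);
INSERT INTO categories (category, subcategory)
SELECT DISTINCT category, subcategory FROM animals;
""")


def link(op):
    """Reset and refill category_id using the given comparison operator.

    Returns the number of animals left without a category_id.
    """
    with conn:
        conn.execute("UPDATE animals SET category_id = NULL")
        conn.execute(f"""
            UPDATE animals SET category_id = (
                SELECT c.id FROM categories c
                WHERE c.category {op} animals.category
                  AND c.subcategory {op} animals.subcategory)
        """)
    return conn.execute(
        "SELECT count(*) FROM animals WHERE category_id IS NULL").fetchone()[0]


unlinked_with_equals = link("=")  # NULL = NULL is not true, so Platypus is missed
unlinked_null_safe = link("IS")   # NULL-safe comparison matches the NULL subcategory
print(unlinked_with_equals, unlinked_null_safe)  # 1 0
```

With plain equality one row keeps a NULL category_id, which would make a later NOT NULL constraint on the new column fail.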
I'm not sure I completely understand your question; if this doesn't seem to answer it, please leave a comment and possibly improve your question to clarify. It sounds like you want to do a CREATE TABLE xxx AS. For example:
CREATE TABLE category AS (SELECT DISTINCT(category) AS id FROM parent_table);
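Then give the new table a primary key (a foreign key must reference a primary key or unique constraint; this assumes the category column contains no NULLs) and alter the parent_table to add the foreign key constraint.
ALTER TABLE category ADD PRIMARY KEY (id);
ALTER TABLE parent_table ADD CONSTRAINT category_fk FOREIGN KEY (category) REFERENCES category (id);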
Repeat this for each table you want to create.
Here is the related documentation:
CREATE TABLE
ALTER TABLE
Note: code and references are for Postgresql 9.4

how to create a Foreign-Key constraint to a subset of the rows of a table?

I have a reference table, say OrderType that collects different types of orders:
CREATE TABLE IF NOT EXISTS OrderType (name VARCHAR);
ALTER TABLE OrderType ADD PRIMARY KEY (name);
INSERT INTO OrderType(name) VALUES('sale-order-type-1');
INSERT INTO OrderType(name) VALUES('sale-order-type-2');
INSERT INTO OrderType(name) VALUES('buy-order-type-1');
INSERT INTO OrderType(name) VALUES('buy-order-type-2');
I wish to create a FK constraint from another table, say SaleInformation, pointing to that table (OrderType). However, I am trying to express that not all rows of OrderType are eligible for the purposes of that FK (it should only be sale-related order types).
I thought about creating a view of table OrderType with just the right kind of rows (view SaleOrderType) and adding a FK constraint to that view, but PostgreSQL balks at that with:
ERROR: referenced relation "SaleOrderType" is not a table
So it seems I am unable to create a FK constraint to a view (why?). Am I only left with the option of creating a redundant table to hold the sale-related order types? The alternative would be to simply allow the FK to point to the original table, but then I am not really expressing the constraint as strictly as I would like to.
I think your schema should be something like this
create table order_nature (
nature_id int primary key,
description text
);
insert into order_nature (nature_id, description)
values (1, 'sale'), (2, 'buy')
;
create table order_type (
type_id int primary key,
description text
);
insert into order_type (type_id, description)
values (1, 'type 1'), (2, 'type 2')
;
create table order_nature_type (
nature_id int references order_nature (nature_id),
type_id int references order_type (type_id),
primary key (nature_id, type_id)
);
insert into order_nature_type (nature_id, type_id)
values (1, 1), (1, 2), (2, 1), (2, 2)
;
create table sale_information (
nature_id int default 1 check (nature_id = 1),
type_id int,
foreign key (nature_id, type_id) references order_nature_type (nature_id, type_id)
);
If the foreign key clause also accepted expressions, sale_information could omit the nature_id column. This is hypothetical syntax; PostgreSQL only allows column names there:
create table sale_information (
type_id int,
foreign key (1, type_id) references order_nature_type (nature_id, type_id)
);
Notice the 1 in the foreign key
You could use an FK to OrderType to ensure referential integrity and a separate CHECK constraint to limit the order types.
If your OrderType values really are that structured then a simple CHECK like this would suffice:
check (c ~ '^sale-order-type-')
where c is the order type column in SaleInformation.
If the types aren't structured that way in reality, then you could add some sort of type flag to OrderType (say a boolean is_sales column) and write a function that uses the flag to determine whether an order type is a sales order:
create or replace function is_sales_order_type(ot text) returns boolean as $$
select exists (select 1 from OrderType where name = ot and is_sales);
$$ language sql;
and then use that in your CHECK:
check(is_sales_order_type(c))
You don't, of course, have to use a boolean is_sales flag; you could have more structure than that. is_sales is just for illustrative purposes.
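One thing to keep in mind is that PostgreSQL evaluates a CHECK only when the row it belongs to is inserted or updated, so a function-based check will not notice later changes to OrderType. A purely declarative variation combines the is_sales flag with the composite foreign key idea from the other answer. The sketch below tries it in an in-memory SQLite database from Python; SQLite is a stand-in for PostgreSQL, the flag is stored as 0/1, and the id column and the helper name accepts are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE OrderType
( name     TEXT PRIMARY KEY
, is_sales INTEGER NOT NULL CHECK (is_sales IN (0, 1))
, UNIQUE (name, is_sales)          -- target for the composite foreign key
);
CREATE TABLE SaleInformation
( id         INTEGER PRIMARY KEY
, order_type TEXT NOT NULL
, is_sales   INTEGER NOT NULL DEFAULT 1 CHECK (is_sales = 1)
, FOREIGN KEY (order_type, is_sales) REFERENCES OrderType (name, is_sales)
);
INSERT INTO OrderType VALUES
  ('sale-order-type-1', 1), ('sale-order-type-2', 1)
, ('buy-order-type-1', 0),  ('buy-order-type-2', 0);
""")


def accepts(sql, params=()):
    """Execute one statement; False means a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert = "INSERT INTO SaleInformation (order_type) VALUES (?)"
sale_ok = accepts(insert, ("sale-order-type-1",))
buy_ok = accepts(insert, ("buy-order-type-1",))
# Reclassifying a type that is already referenced is rejected too.
flip_ok = accepts("UPDATE OrderType SET is_sales = 0 WHERE name = 'sale-order-type-1'")
print(sale_ok, buy_ok, flip_ok)  # True False False
```

The price is a redundant is_sales column in SaleInformation, the same trade-off as the nature_id column in the schema above.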