Storing arbitrary attributes on tables - sql

I have 3 tables, x, y, and z. I want to be able to attach arbitrary
attributes to each row in each table. x, y, and z have nothing in
common other than the fact that they all have an integer primary key called
id and should be able to have arbitrary attributes attached to them.
Is it better to make a single attributes table, like
create table attributes (
    table_name enum('x', 'y', 'z'), -- "table" is a reserved word, so the column is named table_name
    xyz_id integer,
    name varchar(50),
    value text,
    primary key (table_name, xyz_id, name)
);
Or is it better to make separate tables, like
create table x_attributes (
    x_id integer,
    name varchar(50),
    value text,
    primary key (x_id, name),
    foreign key (x_id) references x (id)
);
create table y_attributes (...);
create table z_attributes (...);
The second option (separate tables) seems to be cleaner, but requires a lot
more boilerplate on both the database side and the application side.
I'm also open to suggestions other than those two.
Note: I've considered the possibility of using a document store like MongoDB, but
the data I'm working with is fundamentally relational.

Go with one table with an enum column; it will make grabbing all of the attributes for each row easier in the long run.
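A minimal sketch of the single-table option, run against SQLite from Python so it can be tried without a server. SQLite has no enum type, so a CHECK constraint stands in for it here; the column name table_name, the helper attributes_for, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attributes (
    -- CHECK stands in for the enum('x', 'y', 'z') column
    table_name TEXT NOT NULL CHECK (table_name IN ('x', 'y', 'z')),
    xyz_id     INTEGER NOT NULL,
    name       TEXT NOT NULL,
    value      TEXT,
    PRIMARY KEY (table_name, xyz_id, name)
);
-- Made-up sample attributes
INSERT INTO attributes VALUES ('x', 1, 'color', 'red');
INSERT INTO attributes VALUES ('x', 1, 'size', 'large');
INSERT INTO attributes VALUES ('y', 1, 'color', 'blue');
""")


def attributes_for(table_name, row_id):
    """Return {name: value} for one row of one of the three tables."""
    rows = conn.execute(
        "SELECT name, value FROM attributes WHERE table_name = ? AND xyz_id = ?",
        (table_name, row_id),
    )
    return dict(rows.fetchall())


x1_attrs = attributes_for("x", 1)
```

One trade-off to keep in mind: xyz_id cannot carry a foreign key to x, y, and z at once, so this layout does not stop attributes from pointing at rows that no longer exist. The separate-tables option does.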

Related

Oracle SQL: "GENERATED ALWAYS" with a specified sequence

I have two tables that I would like to share the same sequence to populate their primary key ID columns. However, I also don't want the user to specify or change the value of the ID column.
By using the code below, I can let two tables share the same sequence.
CREATE TABLE T1
(
ID INTEGER DEFAULT SEQ_1.nextval NOT NULL
);
This code uses its own sequence and prevents users from specifying the ID in an INSERT or changing it afterwards:
CREATE TABLE T1
(
ID INTEGER GENERATED ALWAYS AS IDENTITY NOT NULL
);
Is there a way to get the best of both worlds? Something like this:
CREATE TABLE T1
(
ID INTEGER GENERATED ALWAYS AS ( SEQ_1.nextval ) NOT NULL
);
Regarding the use case, as @Sujitmohanty30 asked, this is the reason I raised this question:
I'm thinking of implementing inheritance in the database. Consider this UML diagram (I can't post images directly due to insufficient reputation, and sorry for the lack of imagination).
ANIMAL is abstract and all inheritance is mandatory. This means no instance of ANIMAL should be created. Furthermore, there is a one-to-many relationship between ANIMAL and ZOO_KEEPER.
Therefore, I came up with this idea:
CREATE SEQUENCE ANIMAL_ID_SEQ;
CREATE TABLE HORSE
(
ID INT DEFAULT ANIMAL_ID_SEQ.nextval NOT NULL PRIMARY KEY,
HEIGHT DECIMAL(3, 2) NOT NULL
);
CREATE TABLE DOLPHIN
(
ID INT DEFAULT ANIMAL_ID_SEQ.nextval NOT NULL PRIMARY KEY,
LENGTH DECIMAL(3, 2) NOT NULL
);
CREATE MATERIALIZED VIEW LOG ON HORSE WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON DOLPHIN WITH ROWID;
CREATE MATERIALIZED VIEW ANIMAL
REFRESH FAST ON COMMIT
AS
SELECT 'horse' AS TYPE, ROWID AS RID, ID -- TYPE column is used as a UNION ALL marker
FROM HORSE
UNION ALL
SELECT 'dolphin' AS TYPE, ROWID AS RID, ID
FROM DOLPHIN;
ALTER TABLE ANIMAL
ADD CONSTRAINT ANIMAL_PK PRIMARY KEY (ID);
CREATE TABLE ZOO_KEEPER
(
NAME VARCHAR(50) NOT NULL PRIMARY KEY,
ANIMAL_ID INT NOT NULL REFERENCES ANIMAL (ID)
);
In this case, the shared sequence is used to avoid ID collisions in the ANIMAL materialized view. Each table uses DEFAULT to get the next ID from the shared sequence. However, DEFAULT doesn't prevent users from manually specifying the ID in an INSERT or changing it with an UPDATE.
You can create a master view/table and generate the sequence in it.
Then copy it as column values into both tables while inserting.
Another option is to insert into both tables at the same time: use SEQ.NEXTVAL when inserting into the first table to get a new ID, and then SEQ.CURRVAL to copy the same ID into the other table.
No, you can't have anything like this, because the ID is generated independently for each table; this can be done only by using a sequence when you insert the data into both tables at the same time.
You should normalize your data schema: add an animal_type column to the table and create a composite primary key on both columns.

SQL migration - move data to another table and get the primary key

Problem
Suppose a table like this:
CREATE TABLE parent (
parent_id INTEGER PRIMARY KEY,
col1 REAL,
col2 REAL
);
Then the requirements of the system change and parent needs to hold the information in col1 and col2 at two different points in time. One possible design would be the creation of a separate table for col1 and col2:
CREATE TABLE child (
child_id INTEGER PRIMARY KEY,
col1 REAL,
col2 REAL
);
CREATE TABLE parent (
parent_id INTEGER PRIMARY KEY,
current_child INTEGER,
previous_child INTEGER,
FOREIGN KEY (current_child) REFERENCES child (child_id),
FOREIGN KEY (previous_child) REFERENCES child (child_id)
);
Questions
Is there a way to create a migration SQL script to move the data from the original parent table to child and set the correct foreign key at current_child?
previous_child could stay empty initially, this is not a problem.
I'm not concerned yet about removing the columns col1 and col2 from parent.
I'd like to avoid auxiliary scripts in other languages (such as Python) to keep the migration process simple.
Is there a better design alternative?
I could have simply added two new columns previous_col1 and previous_col2, but I have many more columns in the real scenario.
PS.: It is probably relevant to know that I'm using SQLite.
Update
@Felix.leg asked an excellent question and I noticed that I was stuck on the idea of always using automatically generated PK values for both tables. The truth is that child_id could be derived from parent_id:
current_child = 2 * parent_id - 1
previous_child = 2 * parent_id
With this approach, I believe it would be possible to write a single SQL migration script to create the new table, move the data, and set the right foreign key values.
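The approach above can be sketched as a single SQL script. It is wrapped in Python's sqlite3 module here only so that it can be run and checked; the SQL itself needs no other language. It assumes that adding the two foreign key columns with ALTER TABLE is acceptable instead of recreating parent, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original schema with made-up sample data
CREATE TABLE parent (
    parent_id INTEGER PRIMARY KEY,
    col1 REAL,
    col2 REAL
);
INSERT INTO parent VALUES (1, 1.5, 2.5), (2, 3.5, 4.5);

-- Migration starts here
CREATE TABLE child (
    child_id INTEGER PRIMARY KEY,
    col1 REAL,
    col2 REAL
);
ALTER TABLE parent ADD COLUMN current_child INTEGER REFERENCES child (child_id);
ALTER TABLE parent ADD COLUMN previous_child INTEGER REFERENCES child (child_id);

-- Move the data, deriving child_id from parent_id
INSERT INTO child (child_id, col1, col2)
SELECT 2 * parent_id - 1, col1, col2 FROM parent;

-- Point each parent at its migrated row; previous_child stays NULL
UPDATE parent SET current_child = 2 * parent_id - 1;
""")

migrated = conn.execute("""
    SELECT p.parent_id, p.current_child, p.previous_child, c.col1, c.col2
    FROM parent AS p
    JOIN child AS c ON c.child_id = p.current_child
    ORDER BY p.parent_id
""").fetchall()
```

The old col1 and col2 columns are left in parent, as in the question; the even child_id values (2 * parent_id) remain free for previous_child rows.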

How to maintain database integrity when table inheritance cannot be used

I have several different component types that each have drastically different data specs to store so each component type needs its own table, but they all share some common columns. I'm most concerned with [component.ID] which must be a unique identifier to a component regardless of component type (unique across many tables).
First Option
My first idea was inheritance where the table for each component type inherits a generic [component] table.
create table if not exists component (
    ID bigint primary key default nextval('component_id_seq'),
    typeID bigint not null references componentType (ID),
    manufacturerID bigint not null references manufacturer (ID),
    retailPrice numeric check (retailPrice >= 0.0),
    purchasePrice numeric check (purchasePrice >= 0.0),
    manufacturerPartNum varchar(255) not null,
    isLegacy boolean default false,
    check (retailPrice >= purchasePrice)
);
create table if not exists motherboard (
    foo bigint,
    bar bigint
) inherits (component); -- <-- this guy right here!!
/* there would be many other tables with different specific types of components
   which each inherit the [component] table */
PostgreSQL inheritance has some caveats that seem to make this a bad idea.
Constraints like unique or primary key are not respected by the inheriting table. Even if you specify unique in the inheriting table it would only be unique in that table and could duplicate values in the parent table or other inheriting tables.
References do not carry over from the parent table. So the references for typeID or manufacturerID would not apply to the inheriting table.
References to the parent table would not include data in the inheriting tables. This is the worst deal breaker for me in using inheritance, because I need to be able to reference all components regardless of type.
Second Option
If I don't use inheritance, I can use the component table as a master component list holding the data common to any component of any type, and then have a table for each type of component where each entry refers to a component.ID. That works fine, but how do I enforce it?
How do I enforce that each entry in the component table has one and only one corresponding entry in only one of many other tables? The part that baffles me is that there are many tables and the corresponding entry could be in any of them.
A simple reference back to the component table will ensure that each row in the many specific component type tables has one valid component.id to which it belongs.
Third Option
Last of all I could forego a master component table altogether and just have each table for a specific component type have those same columns. Then I am left with the conundrum of how to enforce a unique component ID across many tables and also how to search across all these many tables (which may very well grow or shrink) in queries. I don't want a huge unwieldy UNION between all these tables. That would bog any select query to frozen molasses speed.
Fourth Option
This strikes me as a problem that comes up from time to time in DB design, and there is probably a name for it that I don't know and perhaps a solution that is entirely different from the above three options.
The foreign key should include the type of the subcomponent. The example speaks for itself.
create table component(
    id int generated always as identity primary key,
    -- or
    -- id serial primary key,
    type_id int not null,
    general_info text,
    unique (type_id, id)
);
create table subcomponent_1 (
    id int primary key,
    type_id int generated always as (1) stored,
    -- or
    -- type_id int default 1 check(type_id = 1),
    specific_info text,
    foreign key(type_id, id) references component(type_id, id)
);
insert into component (type_id, general_info)
values (1, 'component type 1');
insert into subcomponent_1 (id, specific_info)
values (1, 'specific info');
Note:
update component
set type_id = 2
where id = 1;
ERROR: update or delete on table "component" violates foreign key constraint "subcomponent_1_type_id_id_fkey" on table "subcomponent_1"
DETAIL: Key (type_id, id)=(1, 1) is still referenced from table "subcomponent_1".
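To see the same pattern reject a row placed in the wrong subcomponent table, here is a small sketch ported to SQLite and driven from Python. It uses the default-plus-check variant of type_id from the comments above; the second table subcomponent_2 and the helper rejected are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE component (
    id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL,
    general_info TEXT,
    UNIQUE (type_id, id)
);
CREATE TABLE subcomponent_1 (
    id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL DEFAULT 1 CHECK (type_id = 1),
    specific_info TEXT,
    FOREIGN KEY (type_id, id) REFERENCES component (type_id, id)
);
-- Hypothetical second component type, added for this sketch
CREATE TABLE subcomponent_2 (
    id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL DEFAULT 2 CHECK (type_id = 2),
    specific_info TEXT,
    FOREIGN KEY (type_id, id) REFERENCES component (type_id, id)
);
INSERT INTO component (id, type_id, general_info) VALUES (1, 1, 'component type 1');
INSERT INTO subcomponent_1 (id, specific_info) VALUES (1, 'specific info');
""")


def rejected(sql):
    """Run one statement; report whether it failed an integrity check."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Component 1 has type 1, so it cannot also get a row in subcomponent_2.
wrong_table = rejected("INSERT INTO subcomponent_2 (id, specific_info) VALUES (1, 'x')")
# Changing the type while a subcomponent row exists fails, as in the note above.
retyped = rejected("UPDATE component SET type_id = 2 WHERE id = 1")
```

This guarantees that a component has at most one subcomponent row, and only in the table matching its type. It does not by itself guarantee that every component row has a subcomponent row at all.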

SQLite database with multi-valued properties

I want to create an SQLite database for storing objects. The objects have properties with multiple values, for which I have created separate tables.
CREATE TABLE objs
(
id INTEGER,
name TEXT
);
CREATE TABLE prop1
(
id INTEGER,
value TEXT,
FOREIGN KEY(id) REFERENCES objs(id)
);
CREATE TABLE prop2
(
id INTEGER,
value TEXT,
FOREIGN KEY(id) REFERENCES objs(id)
);
For a list of ids that I get as the result of JOINs, I want to find the values of these two properties. To do that, I perform the JOINs followed by another JOIN with the 'prop1' table, and then repeat this for the 'prop2' table. I suspect this is inefficient (too many joins) and can be improved. I have two questions.
Is this the correct way to design the DB?
What is the most efficient way of extracting the values of the properties I want?
I would suggest the following structure.
CREATE TABLE objs
(
    id INTEGER PRIMARY KEY, -- the foreign key below needs a primary key or unique column to reference
    name TEXT
);
CREATE TABLE properties
(
    id INTEGER,
    Property_name varchar(50),
    Property_type varchar(10),
    value TEXT,
    FOREIGN KEY(id) REFERENCES objs(id)
);
Storing each type of property in a different table is a very bad idea. You can just store the property name and type (string, numeric, etc.). You can also add multiple value columns like numeric_value, string_value, and so on.
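With a single properties table, the values of several properties for a list of ids can be fetched in one query instead of one JOIN per property. A rough sketch using Python's sqlite3 module follows; the index, the helper properties_for, and the sample rows are assumptions for illustration, not part of the structure suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objs (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE properties (
    id INTEGER REFERENCES objs (id),
    property_name TEXT,
    property_type TEXT,
    value TEXT
);
-- Assumed index so lookups by object id and property name avoid a full scan
CREATE INDEX properties_by_obj ON properties (id, property_name);

-- Made-up sample data: object 1 has two values for prop1
INSERT INTO objs VALUES (1, 'first'), (2, 'second');
INSERT INTO properties VALUES
    (1, 'prop1', 'string', 'a'),
    (1, 'prop1', 'string', 'b'),
    (1, 'prop2', 'string', 'c'),
    (2, 'prop2', 'string', 'd');
""")


def properties_for(ids, names):
    """Fetch the named properties for the given object ids in one query."""
    id_marks = ", ".join("?" * len(ids))
    name_marks = ", ".join("?" * len(names))
    rows = conn.execute(
        "SELECT id, property_name, value FROM properties "
        f"WHERE id IN ({id_marks}) AND property_name IN ({name_marks}) "
        "ORDER BY id, property_name, value",
        [*ids, *names],
    )
    grouped = {}
    for obj_id, name, value in rows:
        # Multi-valued properties collect into a list per (object, property)
        grouped.setdefault((obj_id, name), []).append(value)
    return grouped


found = properties_for([1, 2], ["prop1", "prop2"])
```

In a real query, the list of ids would more likely come from a subquery or a join with the earlier result than from bound parameters.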

Constraint To Prevent Adding Value Which Exists In Another Table

I would like to add a constraint which prevents adding a value to a column if the value exists in the primary key column of another table. Is this possible?
EDIT:
Table: MasterParts
MasterPartNumber (Primary Key)
Description
....
Table: AlternateParts
MasterPartNumber (Composite Primary Key, Foreign Key to MasterParts.MasterPartNumber)
AlternatePartNumber (Composite Primary Key)
Problem - Alternate part numbers for each master part number must not themselves exist in the master parts table.
EDIT 2:
Here is an example:
MasterParts
MasterPartNumber  Description      MinLevel  MaxLevel  ReorderLevel
010-00820-50      Garmin GTN™ 750  1         5         2
AlternateParts
MasterPartNumber  AlternatePartNumber
010-00820-50      0100082050
010-00820-50      GTN750
The only way I can think of to solve this would be to write a checking function (I'm not sure what language you are working with), or to play around with the table relationships to ensure that it's unique.
Why not have a single "part" table with an "is master part" flag and then have an "alternate parts" table that maps a "master" part to one or more "alternate" parts?
Here's one way to do it without procedural code. I've deliberately left out ON UPDATE CASCADE and ON DELETE CASCADE, but in production I might use both. (But I'd severely limit who's allowed to update and delete part numbers.)
-- New tables
create table part_numbers (
    pn varchar(50) primary key,
    pn_type char(1) not null check (pn_type in ('m', 'a')),
    unique (pn, pn_type)
);
create table part_numbers_master (
    pn varchar(50) primary key,
    pn_type char(1) not null default 'm' check (pn_type = 'm'),
    description varchar(100) not null,
    foreign key (pn, pn_type) references part_numbers (pn, pn_type)
);
create table part_numbers_alternate (
    pn varchar(50) primary key,
    pn_type char(1) not null default 'a' check (pn_type = 'a'),
    foreign key (pn, pn_type) references part_numbers (pn, pn_type)
);
-- Now, your tables.
create table masterparts (
    master_part_number varchar(50) primary key references part_numbers_master,
    min_level integer not null default 0 check (min_level >= 0),
    max_level integer not null default 0 check (max_level >= min_level),
    reorder_level integer not null default 0
        check ((reorder_level < max_level) and (reorder_level >= min_level))
);
create table alternateparts (
    master_part_number varchar(50) not null references part_numbers_master (pn),
    alternate_part_number varchar(50) not null references part_numbers_alternate (pn),
    primary key (master_part_number, alternate_part_number)
);
-- Some test data
insert into part_numbers values
('010-00820-50', 'm'),
('0100082050', 'a'),
('GTN750', 'a');
insert into part_numbers_master values
('010-00820-50', 'm', 'Garmin GTN™ 750');
insert into part_numbers_alternate (pn) values
('0100082050'),
('GTN750');
insert into masterparts values
('010-00820-50', 1, 5, 2);
insert into alternateparts values
('010-00820-50', '0100082050'),
('010-00820-50', 'GTN750');
In practice, I'd build updatable views for master parts and for alternate parts, and I'd limit client access to the views. The updatable views would be responsible for managing inserts, updates, and deletes. (Depending on your company's policies, you might use stored procedures instead of updatable views.)
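As a quick check that the three part_numbers tables enforce the rule, here is a sketch that loads them into SQLite from Python and tries to reuse an alternate number as a master number. The choice of SQLite, the helper rejected, and the trimmed sample data are assumptions; the table definitions are the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE part_numbers (
    pn VARCHAR(50) PRIMARY KEY,
    pn_type CHAR(1) NOT NULL CHECK (pn_type IN ('m', 'a')),
    UNIQUE (pn, pn_type)
);
CREATE TABLE part_numbers_master (
    pn VARCHAR(50) PRIMARY KEY,
    pn_type CHAR(1) NOT NULL DEFAULT 'm' CHECK (pn_type = 'm'),
    description VARCHAR(100) NOT NULL,
    FOREIGN KEY (pn, pn_type) REFERENCES part_numbers (pn, pn_type)
);
CREATE TABLE part_numbers_alternate (
    pn VARCHAR(50) PRIMARY KEY,
    pn_type CHAR(1) NOT NULL DEFAULT 'a' CHECK (pn_type = 'a'),
    FOREIGN KEY (pn, pn_type) REFERENCES part_numbers (pn, pn_type)
);
INSERT INTO part_numbers VALUES ('010-00820-50', 'm'), ('GTN750', 'a');
INSERT INTO part_numbers_master VALUES ('010-00820-50', 'm', 'Garmin GTN 750');
INSERT INTO part_numbers_alternate (pn) VALUES ('GTN750');
""")


def rejected(sql):
    """Run one statement; report whether it failed an integrity check."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# The number is already registered as an alternate, so it cannot be registered again...
duplicate_number = rejected("INSERT INTO part_numbers VALUES ('GTN750', 'm')")
# ...and it cannot be slipped into the master table directly either.
alternate_as_master = rejected(
    "INSERT INTO part_numbers_master (pn, description) VALUES ('GTN750', 'dup')"
)
```

The primary key on part_numbers.pn blocks the first attempt, and the composite foreign key on (pn, pn_type) blocks the second.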
Your design is perfect.
But SQL isn't very helpful when you try to implement such a design. There is no declarative way in SQL to enforce your business rule. You'll have to write two triggers: one for inserts into masterparts, checking that the new masterpart identifier doesn't yet exist as an alias, and the other for inserts of aliases, checking that the new alias identifier doesn't yet identify a masterpart.
Or you can do this in the application, which is worse than triggers from the data integrity point of view.
(If you want to read up on how to enforce constraints of arbitrary complexity within an SQL engine, the best coverage I have seen of the topic is in the book "Applied Mathematics for Database Professionals".)
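The two triggers could look roughly like this. The sketch uses SQLite's trigger syntax, run from Python, because the question doesn't say which engine is in use; other engines spell triggers differently. The trigger names, the trimmed table definitions, and the helper rejected are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE masterparts (
    master_part_number TEXT PRIMARY KEY,
    description TEXT
);
CREATE TABLE alternateparts (
    master_part_number TEXT NOT NULL REFERENCES masterparts (master_part_number),
    alternate_part_number TEXT NOT NULL,
    PRIMARY KEY (master_part_number, alternate_part_number)
);

-- Trigger 1: a new master part number must not already exist as an alias
CREATE TRIGGER masterparts_not_an_alias
BEFORE INSERT ON masterparts
WHEN EXISTS (SELECT 1 FROM alternateparts
             WHERE alternate_part_number = NEW.master_part_number)
BEGIN
    SELECT RAISE(ABORT, 'part number already exists as an alternate');
END;

-- Trigger 2: a new alias must not already identify a master part
CREATE TRIGGER alias_not_a_masterpart
BEFORE INSERT ON alternateparts
WHEN EXISTS (SELECT 1 FROM masterparts
             WHERE master_part_number = NEW.alternate_part_number)
BEGIN
    SELECT RAISE(ABORT, 'alternate number already exists as a master part');
END;

INSERT INTO masterparts VALUES ('010-00820-50', 'Garmin GTN 750');
INSERT INTO alternateparts VALUES ('010-00820-50', 'GTN750');
""")


def rejected(sql):
    """Run one statement; report whether the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


alias_as_master = rejected("INSERT INTO masterparts VALUES ('GTN750', 'x')")
master_as_alias = rejected(
    "INSERT INTO alternateparts VALUES ('010-00820-50', '010-00820-50')"
)
valid_alias = rejected("INSERT INTO alternateparts VALUES ('010-00820-50', '0100082050')")
```

These two triggers only cover inserts, as described above. If part numbers can be updated, matching BEFORE UPDATE triggers would be needed as well.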
Apart from the fact that this sounds like a possibly poor design, you in essence want values spanning two columns in different tables to be unique.
In order to utilize the DB's native capability to check for uniqueness, you can create a third, helper column that contains a copy of all the values in the two columns in question, and put a uniqueness constraint on that column. For each new value added to one of your target columns, you need to add the same value to the helper column. To make this a constraint inside the DB, you can do this with a trigger.
And again, needing to do the above sounds like evidence of a poor design.
--
Edit:
Regarding your edit:
You say "Alternate part numbers for each master part number must not themselves exist in the master parts table."
This is itself a design decision, which you don't explain.
I don't know enough about the domain of your problem, but:
If you think of master and alternate parts as totally different things, there is no reason why you would want "Alternate part numbers for each master part number must not themselves exist in the master parts table". Otherwise, you have a common notion of "parts", whether master or alternate. This means they need to be in the same table and column.
If the second is true, you need something like this:
table "parts"
columns:
id - pk
is_master - boolean (assuming a part can not be master and alternate at the same time)
description - text
This table's role is to list and describe the parts.
Then you have several ways to denote which part is an alternate to which. It depends on whether a part can be an alternate to more than one part. And it sounds like one master part can have several alternates anyway.
You can do it in the same table, or create another one.
If the same table: add a column, alternate_to, which will be null for master parts and will have a foreign key to the id column of the same table.
Otherwise, create a table, say "alternatives", with master_id and alternate_id, both referencing the parts table with foreign keys.
(The first option assumes that a part cannot be an alternate to more than one other part. If this is not true, the second will work anyway.)