SQLite database with multi-valued properties - sql

I want to create a SQLite database for storing objects. The objects have properties with multiple values, for which I have created separate tables.
CREATE TABLE objs
(
id INTEGER,
name TEXT
);
CREATE TABLE prop1
(
id INTEGER,
value TEXT,
FOREIGN KEY(id) REFERENCES objs(id)
);
CREATE TABLE prop2
(
id INTEGER,
value TEXT,
FOREIGN KEY(id) REFERENCES objs(id)
);
For a list of ids that I get as the result of JOINs, I want to find the values of these two properties. To do that, I perform the JOINs followed by another JOIN with the 'prop1' table, then repeat this for the 'prop2' table. I suspect this is inefficient (too many joins) and can be improved. I have two questions.
Is this the correct way to design the DB?
What is the most efficient way of extracting the values of the properties I want?

I would suggest the following structure.
CREATE TABLE objs
(
id INTEGER,
name TEXT
);
CREATE TABLE properties
(
id INTEGER,
Property_name varchar(50),
Property_type varchar(10),
value TEXT,
FOREIGN KEY(id) REFERENCES objs(id)
);
Storing each kind of property in its own table is a very bad idea, because every new property then needs a new table and one more join. Instead, store the property name and type (string, numeric, etc.) as columns. You can also add multiple value columns such as numeric_value, string_value and so on.
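As a rough illustration of how the single properties table answers the second question, here is a minimal sketch using Python's built-in sqlite3 module. The index name, the sample rows and the helper properties_for are made up for the example, and the Property_type column is left out for brevity. One query returns every requested property for a list of ids, so no join per property is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE properties (
        id INTEGER REFERENCES objs(id),
        property_name TEXT,
        value TEXT
    );
    -- Assumed index: lets lookups by object id and property name avoid a full scan.
    CREATE INDEX properties_by_obj ON properties (id, property_name);
    INSERT INTO objs VALUES (1, 'a'), (2, 'b');
    INSERT INTO properties VALUES
        (1, 'prop1', 'x'), (1, 'prop1', 'y'), (1, 'prop2', 'z'), (2, 'prop1', 'w');
""")

def properties_for(conn, ids, names):
    """Return (id, property_name, value) rows for the given ids and property names."""
    id_marks = ",".join("?" * len(ids))
    name_marks = ",".join("?" * len(names))
    sql = (
        "SELECT id, property_name, value FROM properties "
        f"WHERE id IN ({id_marks}) AND property_name IN ({name_marks}) "
        "ORDER BY id, property_name, value"
    )
    return conn.execute(sql, [*ids, *names]).fetchall()

rows = properties_for(conn, [1], ["prop1", "prop2"])
```

In a real query the list of ids would come from the earlier JOINs, for example as a subquery in the IN clause instead of bound parameters.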

Related

Oracle SQL: "GENERATED ALWAYS" with a specified sequence

I have two tables that should share the same sequence to populate their primary key ID columns. However, I also don't want the user to specify or change the value of the ID column.
By using the code below, I can let two tables share the same sequence.
CREATE TABLE T1
(
ID INTEGER DEFAULT SEQ_1.nextval NOT NULL
);
This code will use its own sequence and prevent users from changing or specifying with INSERT:
CREATE TABLE T1
(
ID INTEGER GENERATED ALWAYS AS IDENTITY NOT NULL
);
Is there a way to get the best of both worlds? Something like this:
CREATE TABLE T1
(
ID INTEGER GENERATED ALWAYS AS ( SEQ_1.nextval ) NOT NULL
);
Regarding the use case, as @Sujitmohanty30 asked, this is the reason I raised the question:
I'm thinking of implementing inheritance in the database. Consider this UML diagram (I can't post images directly due to insufficient reputation, and sorry for the lack of imagination).
ANIMAL is abstract and all inheritance is mandatory. This means no instance of ANIMAL should be created. Furthermore, there is a one-to-many relationship between ANIMAL and ZOO_KEEPER.
Therefore, I came up with this idea:
CREATE SEQUENCE ANIMAL_ID_SEQ;
CREATE TABLE HORSE
(
ID INT DEFAULT ANIMAL_ID_SEQ.nextval NOT NULL PRIMARY KEY,
HEIGHT DECIMAL(3, 2) NOT NULL
);
CREATE TABLE DOLPHIN
(
ID INT DEFAULT ANIMAL_ID_SEQ.nextval NOT NULL PRIMARY KEY,
LENGTH DECIMAL(3, 2) NOT NULL
);
CREATE MATERIALIZED VIEW LOG ON HORSE WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON DOLPHIN WITH ROWID;
CREATE MATERIALIZED VIEW ANIMAL
REFRESH FAST ON COMMIT
AS
SELECT 'horse' AS TYPE, ROWID AS RID, ID -- TYPE column is used as a UNION ALL marker
FROM HORSE
UNION ALL
SELECT 'dolphin' AS TYPE, ROWID AS RID, ID
FROM DOLPHIN;
ALTER TABLE ANIMAL
ADD CONSTRAINT ANIMAL_PK PRIMARY KEY (ID);
CREATE TABLE ZOO_KEEPER
(
NAME VARCHAR(50) NOT NULL PRIMARY KEY,
ANIMAL_ID INT NOT NULL REFERENCES ANIMAL (ID)
);
In this case, the shared sequence is used to avoid collisions in the ANIMAL mview. Each table uses DEFAULT to get the next ID from the shared sequence. However, using DEFAULT doesn't prevent users from manually specifying the ID on INSERT or changing it with UPDATE.
You can create a master view/table and generate the sequence in it.
Then copy it as column values into both tables while inserting.
Another option is to insert into both tables at the same time: use SEQ.NEXTVAL when inserting into the first table to get a new ID, and then SEQ.CURRVAL to copy the same ID into the second table.
No, you can't have anything like this, because the ID is generated independently for each table. It can be done with a sequence only when you insert the data into both tables at the same time.
You should normalize your data schema: add an animal_type column to the table and create a composite primary key on both columns.
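The "master table" suggestion above can be sketched roughly as follows. This is not Oracle code: it uses Python's sqlite3 module, with AUTOINCREMENT standing in for the shared sequence, and the helpers add_horse and add_dolphin are hypothetical names. It shows only how one parent table can hand out IDs that never collide across subtypes. It does not by itself stop a user from supplying an ID manually.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The master table owns ID generation for every subtype.
    CREATE TABLE animal (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        animal_type TEXT NOT NULL CHECK (animal_type IN ('horse', 'dolphin'))
    );
    CREATE TABLE horse (
        id INTEGER PRIMARY KEY REFERENCES animal(id),
        height REAL NOT NULL
    );
    CREATE TABLE dolphin (
        id INTEGER PRIMARY KEY REFERENCES animal(id),
        length REAL NOT NULL
    );
""")

def add_horse(conn, height):
    """Insert the parent row first, then reuse its generated id for the horse row."""
    with conn:  # both inserts commit together or not at all
        new_id = conn.execute(
            "INSERT INTO animal (animal_type) VALUES ('horse')").lastrowid
        conn.execute("INSERT INTO horse (id, height) VALUES (?, ?)", (new_id, height))
    return new_id

def add_dolphin(conn, length):
    """Same pattern for the other subtype."""
    with conn:
        new_id = conn.execute(
            "INSERT INTO animal (animal_type) VALUES ('dolphin')").lastrowid
        conn.execute("INSERT INTO dolphin (id, length) VALUES (?, ?)", (new_id, length))
    return new_id

horse_id = add_horse(conn, 1.6)
dolphin_id = add_dolphin(conn, 2.5)
```

With this layout ZOO_KEEPER can reference animal(id) directly, so the materialized view is no longer needed for the foreign key.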

How do you ensure values from a logging table match objects in other tables?

I have three tables: two basic tables listing objects and a third table logging changes in the database. Here is an example.
create table individual (ind_id integer, age integer, name varchar);
create table organisation (org_id integer, city varchar, name varchar);
create TABLE log_table (log_id integer, object_id integer, table_name varchar, information json, log_date date);
I want to ensure that any row in the log_table corresponds to an existing object in either the individual table or the organisation table. This means that the insertion
insert into log_table (object_id,table_name,information,log_date) values (13,'organisation','{"some":"interesting","information":"on the changes"}','2017-11-09');
is valid only if the table organisation contains a record with the ID 13.
How can I do that in PostgreSQL? If this is not possible, then I suppose I will have to create one column for the individual table and one for the organisation table in the log_table.
You need an entity table:
create table entity (
entity_id serial primary key,
entity_type text check (entity_type in ('individual','organisation'))
);
create table individual (
ind_id integer primary key references entity (entity_id),
age integer, name varchar
);
create table organisation (
org_id integer primary key references entity (entity_id),
city varchar, name varchar
);
create TABLE log_table (
log_id integer primary key,
entity_id integer references entity (entity_id),
information json, log_date date
);
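To see the effect of the entity table, here is a small sketch that reproduces the schema in SQLite through Python's sqlite3 module (an assumption made so the example runs without a server; PostgreSQL enforces foreign keys without the PRAGMA). The individual table is left out for brevity. The helper try_log and the sample organisation row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE entity (
        entity_id INTEGER PRIMARY KEY,
        entity_type TEXT CHECK (entity_type IN ('individual', 'organisation'))
    );
    CREATE TABLE organisation (
        org_id INTEGER PRIMARY KEY REFERENCES entity (entity_id),
        city TEXT, name TEXT
    );
    CREATE TABLE log_table (
        log_id INTEGER PRIMARY KEY,
        entity_id INTEGER REFERENCES entity (entity_id),
        information TEXT, log_date TEXT
    );
    INSERT INTO entity VALUES (13, 'organisation');
    INSERT INTO organisation VALUES (13, 'Lyon', 'Example Org');
""")

def try_log(conn, entity_id):
    """Try to log a change; return False if the foreign key rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO log_table (entity_id, information, log_date) "
                "VALUES (?, ?, ?)",
                (entity_id, '{"some": "interesting"}', "2017-11-09"),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_log(conn, 13)  # entity 13 exists
rejected = try_log(conn, 99)  # no such entity
```

The log row no longer needs a table_name column, because entity_type already records which kind of object the ID belongs to.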
You could also use triggers to solve this problem. Separate triggers can be created on the individual and organisation tables, firing before update, after update, or after insert.
You could add a column to the log table that records the action performed on the base table, i.e. update or insert.
You could also add a unique constraint on table name and object id.
This would log every operation on the tables without any change to the application code.
Hope this helps!
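A minimal sketch of the trigger idea, written for SQLite via Python's sqlite3 module rather than PostgreSQL (PostgreSQL triggers call a separate trigger function, so the syntax there differs). The action column and the trigger names are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE individual (ind_id INTEGER PRIMARY KEY, age INTEGER, name TEXT);
    CREATE TABLE log_table (
        log_id INTEGER PRIMARY KEY,
        object_id INTEGER,
        table_name TEXT,
        action TEXT,       -- assumed extra column: 'insert' or 'update'
        log_date TEXT
    );
    -- Each trigger writes one log row for the row that was just changed.
    CREATE TRIGGER individual_after_insert AFTER INSERT ON individual
    BEGIN
        INSERT INTO log_table (object_id, table_name, action, log_date)
        VALUES (NEW.ind_id, 'individual', 'insert', date('now'));
    END;
    CREATE TRIGGER individual_after_update AFTER UPDATE ON individual
    BEGIN
        INSERT INTO log_table (object_id, table_name, action, log_date)
        VALUES (NEW.ind_id, 'individual', 'update', date('now'));
    END;
""")

conn.execute("INSERT INTO individual VALUES (1, 30, 'Ada')")
conn.execute("UPDATE individual SET age = 31 WHERE ind_id = 1")
log_rows = conn.execute(
    "SELECT object_id, table_name, action FROM log_table ORDER BY log_id"
).fetchall()
```

Because the log rows are written by the database itself, every row in log_table refers to an object that existed at the time of the change. Rows deleted later would still need separate handling.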
Starting from your current design you can enforce what you want declaratively by adding to each entity table a constant checked or computed/virtual table/type variant/tag column and a FK (foreign key) (id, table) to the log table.
You have two kinds/types of logged entities. Google sql/database subtypes/polymorphism/inheritance. Or (anti-pattern) 2/many/multiple FKs to 2/many/multiple tables.

Add files to multiple tables M:N

What is the best data model for attaching multiple files to multiple tables? For example, I have 5 tables (articles, blogs, posts...) and for each item I would like to store multiple files. The files table contains only file paths (not the physical files).
Example: I'm using a links table, but when I create a new table in the future, for example "comments", I need to add a new column to the links table.
Is there a better way of modeling such data?
One way to solve this is to use the table inheritance pattern. The main idea is to have a base table (let's call it content) with general shared information about all the items (e.g., creation date) and most importantly, the relationship with files. Then, you may add additional content types in the future without having to worry about their relation to files, since the content parent type already handles it.
E.g.:
CREATE TABLE files (
id NUMERIC PRIMARY KEY,
path VARCHAR(100) NOT NULL
);
CREATE TABLE content (
id NUMERIC PRIMARY KEY,
created TIMESTAMP NOT NULL
);
CREATE TABLE links (
file_id NUMERIC NOT NULL REFERENCES files(id),
content_id NUMERIC NOT NULL REFERENCES content(id),
PRIMARY KEY (file_id, content_id)
);
CREATE TABLE articles (
id NUMERIC PRIMARY KEY REFERENCES content(id),
title VARCHAR(400),
subtitle VARCHAR(400)
);
-- etc...
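As a rough check of the claim that new content types need no change to links, the following sketch builds the same tables in SQLite through Python's sqlite3 module and adds a comments table afterwards. The column types are simplified, and the sample rows and the helper files_for are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
    CREATE TABLE content (id INTEGER PRIMARY KEY, created TEXT NOT NULL);
    CREATE TABLE links (
        file_id INTEGER NOT NULL REFERENCES files(id),
        content_id INTEGER NOT NULL REFERENCES content(id),
        PRIMARY KEY (file_id, content_id)
    );
    CREATE TABLE articles (id INTEGER PRIMARY KEY REFERENCES content(id), title TEXT);
    -- A content type added later: links is left untouched.
    CREATE TABLE comments (id INTEGER PRIMARY KEY REFERENCES content(id), body TEXT);

    INSERT INTO content VALUES (1, '2020-01-01'), (2, '2020-01-02');
    INSERT INTO articles VALUES (1, 'An article');
    INSERT INTO comments VALUES (2, 'A comment');
    INSERT INTO files VALUES (10, '/uploads/a.png'), (11, '/uploads/b.pdf');
    INSERT INTO links VALUES (10, 1), (11, 1), (10, 2);
""")

def files_for(conn, content_id):
    """Return the file paths attached to one content item, whatever its type."""
    rows = conn.execute(
        "SELECT f.path FROM links AS l JOIN files AS f ON f.id = l.file_id "
        "WHERE l.content_id = ? ORDER BY f.path",
        (content_id,),
    )
    return [path for (path,) in rows]

article_files = files_for(conn, 1)
comment_files = files_for(conn, 2)
```

The same lookup serves articles and comments because both share their id with a row in content.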

Storing arbitrary attributes on tables

I have 3 tables, x, y, and z. I want to be able to attach arbitrary
attributes to each row in each table. x, y, and z have nothing in
common other than the fact that they all have an integer primary key called
id and should be able to have arbitrary attributes attached to them.
Is it better to make a single attributes table, like
create table attributes (
table enum('x', 'y', 'z'),
xyz_id integer,
name varchar(50),
value text,
primary key (table, xyz_id, name)
);
Or is it best to make separate tables, like
create table x_attributes (
x_id integer,
name varchar(50),
value text,
primary key (x_id, name),
foreign key (x_id) references x (id)
);
create table y_attributes (...);
create table z_attributes (...);
The second option (separate tables) seems to be cleaner, but requires a lot
more boilerplate on both the database side and the application side.
I'm also open to suggestions other than those two.
Note: I've considered the possibility of using a document store like MongoDB, but
the data I'm working with is fundamentally relational.
Go with one table with an enum column. It will make grabbing all of the attributes for each row easier in the long run.
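Here is a minimal sketch of the single-table option, using SQLite via Python's sqlite3 module. SQLite has no enum type, so a CHECK constraint stands in for it. The column is named owner_table here (a made-up name) because table is a reserved word, and attributes_of is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attributes (
        owner_table TEXT NOT NULL CHECK (owner_table IN ('x', 'y', 'z')),
        owner_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        value TEXT,
        PRIMARY KEY (owner_table, owner_id, name)
    );
    INSERT INTO attributes VALUES
        ('x', 1, 'colour', 'red'),
        ('x', 1, 'size', 'large'),
        ('y', 1, 'colour', 'blue');
""")

def attributes_of(conn, owner_table, owner_id):
    """Return all attributes of one row of x, y or z as a dict."""
    rows = conn.execute(
        "SELECT name, value FROM attributes WHERE owner_table = ? AND owner_id = ?",
        (owner_table, owner_id),
    )
    return dict(rows.fetchall())

x1 = attributes_of(conn, "x", 1)
```

One trade-off to keep in mind: with a single table, xyz_id cannot be declared as a foreign key to x, y or z, so the database cannot reject attributes that point at a missing row. The separate-tables option keeps that guarantee.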

Polymorphism in SQL database tables?

I currently have multiple tables in my database which consist of the same 'basic fields' like:
name character varying(100),
description text,
url character varying(255)
But I have multiple specializations of that basic table, which is for example that tv_series has the fields season, episode, airing, while the movies table has release_date, budget etc.
Now at first this is not a problem, but I want to create a second table, called linkgroups with a Foreign Key to these specialized tables. That means I would somehow have to normalize it within itself.
One way of solving this I have heard of is to normalize it with a key-value-pair-table, but I do not like that idea since it is kind of a 'database-within-a-database' scheme, I do not have a way to require certain keys/fields nor require a special type, and it would be a huge pain to fetch and order the data later.
So I am looking for a way now to 'share' a Primary Key between multiple tables or even better: a way to normalize it by having a general table and multiple specialized tables.
Right, the problem is you want only one object of one sub-type to reference any given row of the parent class. Starting from the example given by @Jay S, try this:
create table media_types (
media_type int primary key,
media_name varchar(20)
);
insert into media_types (media_type, media_name) values
(2, 'TV series'),
(3, 'movie');
create table media (
media_id int not null,
media_type int not null,
name varchar(100),
description text,
url varchar(255),
primary key (media_id),
unique key (media_id, media_type),
foreign key (media_type)
references media_types (media_type)
);
create table tv_series (
media_id int primary key,
media_type int check (media_type = 2),
season int,
episode int,
airing date,
foreign key (media_id, media_type)
references media (media_id, media_type)
);
create table movies (
media_id int primary key,
media_type int check (media_type = 3),
release_date date,
budget numeric(9,2),
foreign key (media_id, media_type)
references media (media_id, media_type)
);
This is an example of the disjoint subtypes mentioned by @mike g.
Re comments by @Countably Infinite and @Peter:
INSERT to two tables would require two insert statements. But that's also true in SQL any time you have child tables. It's an ordinary thing to do.
UPDATE may require two statements, but some brands of RDBMS support multi-table UPDATE with JOIN syntax, so you can do it in one statement.
When querying data, you can do it simply by querying the media table if you only need information about the common columns:
SELECT name, url FROM media WHERE media_id = ?
If you know you are querying a movie, you can get movie-specific information with a single join:
SELECT m.name, v.release_date
FROM media AS m
INNER JOIN movies AS v USING (media_id)
WHERE m.media_id = ?
If you want information for a given media entry, and you don't know what type it is, you'd have to join to all your subtype tables, knowing that only one such subtype table will match:
SELECT m.name, t.episode, v.release_date
FROM media AS m
LEFT OUTER JOIN tv_series AS t USING (media_id)
LEFT OUTER JOIN movies AS v USING (media_id)
WHERE m.media_id = ?
If the given media is a movie, then all columns in t.* will be NULL.
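To show how the composite foreign key and the CHECK constraint keep the subtypes disjoint, here is a cut-down sketch in SQLite via Python's sqlite3 module. The media_types lookup table and several columns are omitted. The NOT NULL on the subtype's media_type is an addition of this sketch, and try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE media (
        media_id INTEGER PRIMARY KEY,
        media_type INTEGER NOT NULL,
        name TEXT,
        UNIQUE (media_id, media_type)
    );
    CREATE TABLE tv_series (
        media_id INTEGER PRIMARY KEY,
        media_type INTEGER NOT NULL CHECK (media_type = 2),
        episode INTEGER,
        FOREIGN KEY (media_id, media_type) REFERENCES media (media_id, media_type)
    );
    CREATE TABLE movies (
        media_id INTEGER PRIMARY KEY,
        media_type INTEGER NOT NULL CHECK (media_type = 3),
        release_date TEXT,
        FOREIGN KEY (media_id, media_type) REFERENCES media (media_id, media_type)
    );
    INSERT INTO media VALUES (1, 3, 'Some Movie');
    INSERT INTO movies VALUES (1, 3, '1999-03-31');
""")

def try_insert(conn, sql, params):
    """Run one insert; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Media 1 is a movie (type 3), so it cannot also become a TV series.
# With type 2 the foreign key fails, because media has no row (1, 2).
series_as_type2 = try_insert(conn, "INSERT INTO tv_series VALUES (?, ?, ?)", (1, 2, 5))
# With type 3 the CHECK (media_type = 2) fails.
series_as_type3 = try_insert(conn, "INSERT INTO tv_series VALUES (?, ?, ?)", (1, 3, 5))

row = conn.execute(
    "SELECT m.name, t.episode, v.release_date "
    "FROM media AS m "
    "LEFT OUTER JOIN tv_series AS t USING (media_id) "
    "LEFT OUTER JOIN movies AS v USING (media_id) "
    "WHERE m.media_id = ?",
    (1,),
).fetchone()
```

Both inserts are rejected, and the outer-join query returns NULL for the tv_series column because media 1 exists only as a movie.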
Consider using a main basic data table with tables extending off of it with specialized information.
Ex.
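create table basic_data (
id int primary key,
name character varying(100),
description text,
url character varying(255)
);
create table tv_series (
id int primary key,
BDID int references basic_data (id), -- foreign key to basic_data
season int,
episode int,
airing date
);
create table movies (
id int primary key,
BDID int references basic_data (id), -- foreign key to basic_data
release_date date,
budget numeric(9,2)
);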
What you are looking for is called 'disjoint subtypes' in the relational world. They are not supported in sql at the language level, but can be more or less implemented on top of sql.
You could create one table with the main fields plus a uid then extension tables with the same uid for each specific case. To query these like separate tables you could create views.
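The view idea might look roughly like this, sketched in SQLite via Python's sqlite3 module. The table names basic_data and movie_ext, the view name movies_view and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE basic_data (uid INTEGER PRIMARY KEY, name TEXT, url TEXT);
    CREATE TABLE movie_ext (
        uid INTEGER PRIMARY KEY REFERENCES basic_data(uid),
        release_date TEXT,
        budget NUMERIC
    );
    -- The view presents base plus extension as if it were one movies table.
    CREATE VIEW movies_view AS
        SELECT b.uid, b.name, b.url, e.release_date, e.budget
        FROM basic_data AS b
        JOIN movie_ext AS e ON e.uid = b.uid;

    INSERT INTO basic_data VALUES
        (1, 'Some Movie', 'http://example.com/1'),
        (2, 'Some Show', 'http://example.com/2');
    INSERT INTO movie_ext VALUES (1, '1999-03-31', 1000000);
""")

movie_rows = conn.execute("SELECT name, release_date FROM movies_view").fetchall()
```

Only rows that have a matching extension row appear in the view, so 'Some Show' is filtered out without any type column.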
Using the disjoint subtype approach suggested by Bill Karwin, how would you do INSERTs and UPDATEs without having to do it in two steps?
For reading data, I can introduce a view that joins and selects based on a specific media_type, but AFAIK I can't update or insert into that view because it affects multiple tables (I am talking about MS SQL Server here). Can this be done without two operations and, naturally, without a stored procedure?
Thanks
The question is quite old, but for modern PostgreSQL versions it's also worth considering the json/jsonb/hstore types.
For example:
create table some_table (
name character varying(100),
description text,
url character varying(255),
additional_data json
);
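As a rough sketch of how the extra column could be used, the example below stores the type-specific fields as JSON text and decodes them in the application. It uses SQLite via Python's sqlite3 and json modules as a stand-in, so it does not show PostgreSQL's own JSON operators. The helpers add_item and get_extra and the sample values are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table ("
    "name TEXT, description TEXT, url TEXT, additional_data TEXT)"
)

def add_item(conn, name, url, **extra):
    """Store the common fields in columns and everything else as JSON text."""
    conn.execute(
        "INSERT INTO some_table (name, url, additional_data) VALUES (?, ?, ?)",
        (name, url, json.dumps(extra)),
    )

def get_extra(conn, name):
    """Return the decoded type-specific fields for one item, or None if missing."""
    row = conn.execute(
        "SELECT additional_data FROM some_table WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None

add_item(conn, "Some Show", "http://example.com/show", season=2, episode=5)
add_item(conn, "Some Movie", "http://example.com/movie", budget=1000000)
show_extra = get_extra(conn, "Some Show")
```

The trade-off is the one raised in the question about key-value tables: the database does not require particular keys or types inside additional_data, so that validation moves to the application.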