SQL Table with mixed data type field Best Practice

Hi everyone,
I would like advice on best practice for creating a relational database structure with a field that holds mixed data types.
I have 'datasets' (some business objects) and I would like to have a list of parameters associated with each dataset. Those parameters can have different types: strings, integers, floats and JSON values.
What would be the best structure for the parameters table? Should I have a single column of string type?
CREATE TABLE param_desc (
id serial PRIMARY KEY,
name varchar NOT NULL,
param_type int -- varchar, int, real, json
);
CREATE TABLE param_value (
id serial PRIMARY KEY,
dataset_id int NOT NULL,
param int NOT NULL REFERENCES param_desc (id),
value varchar NOT NULL,
CONSTRAINT _param_object_id_param_name_id_time_from_key UNIQUE (dataset_id, param)
);
The problem with this approach is that I can't easily cast value when I need additional conditions. For example, I want to get all datasets that have a specific integer parameter with a value greater than 10. But if I add the cast to the WHERE clause, it returns an error, because the non-integer parameters can't be cast.
SELECT dataset_id FROM vw_param_current WHERE name = 'priority' AND value::int > 5
Or should I have 4 separate columns, with 3 of them being NULL for every row?
Or should I have 4 different tables?
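For illustration, the second option (one nullable column per type) might look like this; the num_nonnulls CHECK is just a guess at how to keep exactly one typed column filled per row:
CREATE TABLE param_value (
id serial PRIMARY KEY,
dataset_id int NOT NULL,
param int NOT NULL REFERENCES param_desc (id),
value_text varchar,
value_int int,
value_real real,
value_json json,
CHECK (num_nonnulls(value_text, value_int, value_real, value_json) = 1), -- exactly one typed column is set
UNIQUE (dataset_id, param)
);
-- The earlier filter would then need no cast at all:
-- SELECT dataset_id FROM param_value pv JOIN param_desc pd ON pd.id = pv.param
-- WHERE pd.name = 'priority' AND pv.value_int > 5;
With this layout the type information lives in the column itself, at the cost of three NULL columns in every row.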

Related

How to create a column with the datatype float which only stores up to 3 decimals in PostgreSQL?

I want to create a table in my PostgreSQL database:
CREATE TABLE my_table(
id INT GENERATED ALWAYS AS IDENTITY NOT NULL PRIMARY KEY,
description TEXT,
score FLOAT NOT NULL
);
How do I limit the number of decimals stored in the "score" column to a maximum of 3 decimals?
You would use numeric. However, you need to specify a precision as well, which limits the maximum value:
CREATE TABLE my_table(
id INT GENERATED ALWAYS AS IDENTITY NOT NULL PRIMARY KEY,
description TEXT,
score NUMERIC(10, 3) NOT NULL
);
This will store numbers up to 9,999,999.999.
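As a quick illustration (using the table above; the literal values are just examples), PostgreSQL rounds, rather than rejects, values with more than three decimal places:
INSERT INTO my_table (description, score) VALUES ('example', 1.23456);
SELECT score FROM my_table; -- returns 1.235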

SQLITE3: find IDs across multiple tables

I would like to do an analysis of which codes appear in multiple tables under certain conditions. However, I don't think the database schema suits the task very well, but maybe there's something I don't know about that can help me. Here's a simplified schema:
CREATE TABLE "batchDescription" (
id INTEGER NOT NULL,
name TEXT NOT NULL UNIQUE,
PRIMARY KEY (id)
);
CREATE TABLE "simulationDetails" (
id INTEGER NOT NULL,
ko_index_id INTEGER NOT NULL,
batch_description_id INTEGER NOT NULL,
data1 REAL NOT NULL,
data2 INTEGER NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id),
FOREIGN KEY(batch_description_id) REFERENCES "batchDescription" (id)
);
CREATE TABLE "koIndex" (
id INTEGER NOT NULL,
number_of_kos INTEGER NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE "1kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id),
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
CREATE TABLE "2kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
ko2 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id),
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
CREATE TABLE "3kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
ko2 INTEGER NOT NULL,
ko3 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id),
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
This goes up to table "525kos" which has ko1 to ko525 in it - ko1 to ko525 are IDs that are primary keys in a table not shown here. I want to do an analysis of how often certain IDs are present under certain conditions. Here is a simple example to illustrate:
I would like to count the number of times a certain ID (let's say 127) occurs in any koX column of the "13kos" table when simulationDetails.data1 is not equal to 0. I would do this on a database called ko.db from the bash command line like:
for ko_idx in {1..13}; do sqlite3 ko.db "select count(ko${ko_idx}) from '13kos' where ko${ko_idx} = 127 and ko_index_id in (select ko_index_id from simulationDetails where data1 != 0);"; done
Already this is slow and inefficient, but it is simple compared to what I would like to do. What if I wanted to analyse all the IDs in all possible columns in all "Xkos" tables and compare them by whether data1 is equal or not equal to zero?
Can anybody direct me to a better way of doing this or is the schema design just not very good for this kind of analysis and I'll have to give up?
EDIT: Thought I'd add a bit of extra detail to avoid confusion. I suspect that a good way to achieve what I want would be to somehow combine all the "Xkos" tables into one temporary table and then search for certain IDs in that table. How would I combine all 525 ko tables without writing out each table name?
How would I combine all 525 ko tables without writing out each table name?
1. Create a table with the same number of columns as the largest table (the table into which you merge), allowing nulls.
2. Query the sqlite_master table using something like:
SELECT * FROM sqlite_master WHERE name LIKE '%kos%' AND type = 'table';
3. Loop through the extracted table names, building an INSERT SELECT for each table that will insert the rows from that table into the table created in 1 (see INSERT INTO table SELECT ...; especially in regard to handling missing columns).
All done, the table created in 1 will be populated accordingly.
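As a minimal sketch of steps 1 and 3 (the merged table name all_kos is my own, and the example stops at ko3 for brevity; the per-table statements would be generated by whatever script loops over the names from sqlite_master):
CREATE TABLE all_kos (
ko_index_id INTEGER PRIMARY KEY,
ko1 INTEGER,
ko2 INTEGER,
ko3 INTEGER -- ... and so on up to ko525, all nullable
);
-- one generated statement per source table; columns that table lacks are simply left NULL
INSERT INTO all_kos (ko_index_id, ko1, ko2, ko3)
SELECT ko_index_id, ko1, ko2, ko3 FROM "3kos";
Counting an ID such as 127 across all columns can then be done against the single merged table joined to simulationDetails, instead of looping over 525 tables in bash.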

Save filter criteria on a SQL database

I have a Products table as follows:
create table dbo.Product (
Id int not null,
Name nvarchar (80) not null,
Price decimal not null
)
I am creating Baskets (lists of products) as follows:
create table dbo.Baskets (
Id int not null,
Name nvarchar (80) not null
)
create table dbo.BasketProducts (
BasketId int not null,
ProductId int not null
)
A basket is created based on a Search Criteria using parameters:
MinimumPrice;
MaximumPrice;
Categories (can be zero to many);
MinimumWarrantyPeriod
I need to save these parameters so later I know how the basket was created.
In the future I will have more parameters so I see 2 options:
Add MinimumPrice, MaximumPrice and MinimumWarrantyPeriod as columns to the Baskets table, and add BasketCategories and Categories tables to relate a Basket to Categories.
Create a more flexible design using a Parameters table:
create table dbo.BasketParameters (
BasketId int not null,
ParameterTypeId int not null,
Value nvarchar (400) not null
)
create table dbo.ParameterType (
Id int not null,
Name nvarchar (80) not null
)
Parameter types are MinimumPrice, MaximumPrice, Categories, MinimumWarrantyPeriod, etc.
So for each Basket I have a list of BasketParameters, all different, each holding one value. Later, if I need more parameter types, I add them to the ParameterType table ...
The application will be responsible for using each Basket's parameters to build the Basket ... I will have, for example, a Categories table, but it will be decoupled from BasketParameters.
Does this make sense? Which approach would you use?
Your first option is superior (especially since you are using a relational data store, i.e. SQL Server), since it is properly referential. It will be much easier to maintain and query, as well as far more performant.
Your second solution is equivalent to an EAV table: https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
EAV tables are usually a terrible idea (and if you need that type of flexibility you should probably use a document database or other NoSQL solution instead). The only benefit is if you need to add/remove attributes regularly or based on other criteria.
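For concreteness, the first option could be sketched like this (the table and column names are my assumptions, following the naming already used in the question):
create table dbo.Baskets (
Id int not null primary key,
Name nvarchar (80) not null,
MinimumPrice decimal null,
MaximumPrice decimal null,
MinimumWarrantyPeriod int null
)
create table dbo.Categories (
Id int not null primary key,
Name nvarchar (80) not null
)
create table dbo.BasketCategories (
BasketId int not null references dbo.Baskets (Id),
CategoryId int not null references dbo.Categories (Id),
primary key (BasketId, CategoryId)
)
The nullable filter columns sit directly on Baskets, and the zero-to-many Categories criterion becomes an ordinary junction table, so every saved criterion can be queried and joined with plain SQL.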

SQLite variable CHECK constraint

I have a table with chromosomes (objects that have a length) and a table with regions (for example genes) on the chromosomes (objects whose range is defined by two integers: position start and position end). I would like to forbid inserting into the database regions with coordinates greater than the length of the particular chromosome.
Is it possible in SQLite?
If not is it possible in any other SQL system (preferably free)?
DROP TABLE IF EXISTS chromosomes;
CREATE TABLE chromosomes
(
chromosome_id INTEGER UNIQUE NOT NULL CHECK(TYPEOF(chromosome_id) = 'integer'),
name VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(name) = 'text'),
length INTEGER NOT NULL CHECK(TYPEOF(length) = 'integer' AND length > 0),
PRIMARY KEY (chromosome_id)
);
DROP TABLE IF EXISTS genes;
CREATE TABLE genes
(
gene_id INTEGER UNIQUE NOT NULL CHECK(TYPEOF(gene_id) = 'integer'),
symbol VARCHAR NOT NULL CHECK(TYPEOF(symbol) = 'text'),
refseq_id VARCHAR NOT NULL CHECK(TYPEOF(refseq_id) = 'text'),
chromosome_id INTEGER NOT NULL CHECK(TYPEOF(chromosome_id) = 'integer'),
start INTEGER NOT NULL CHECK(TYPEOF(start) = 'integer' AND start > 0 AND start < end),
end INTEGER NOT NULL CHECK(TYPEOF(end) = 'integer' AND end > 0 AND end > start),
external_db_link VARCHAR NOT NULL CHECK(TYPEOF(external_db_link) = 'text'),
PRIMARY KEY (gene_id),
FOREIGN KEY (chromosome_id) REFERENCES chromosomes(chromosome_id)
);
This type of constraint is not easily available in any database. In general, this would be handled using a trigger. The problem is that it is a constraint between two tables, but it does not use equality.
Triggers are available in SQLite as well as other databases.
One work-around is a check constraint that calls a user-defined function: the function can do the lookup into the chromosomes table and be used in the constraint. SQLite doesn't really have user-defined functions, though; one database that supports this is Postgres.
Another option is to wrap all data modifications in stored procedures (this tends to be the way that I design systems). Then the stored procedure can do all the checks that are needed.
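A minimal sketch of the trigger approach in SQLite (the trigger name and error message are my own; a matching BEFORE UPDATE trigger would be needed as well):
CREATE TRIGGER genes_within_chromosome
BEFORE INSERT ON genes
FOR EACH ROW
BEGIN
SELECT RAISE(ABORT, 'gene coordinates exceed chromosome length')
WHERE NEW."end" > (SELECT length FROM chromosomes WHERE chromosome_id = NEW.chromosome_id);
END;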
Redundantly, bring 'length' into the child table using a foreign key.
Then your CHECK constraint can reference it.
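A sketch of that work-around (the column name chromosome_length is my assumption; it also requires a UNIQUE constraint over (chromosome_id, length) in the chromosomes table for the composite foreign key to reference):
CREATE TABLE genes
(
gene_id INTEGER PRIMARY KEY,
chromosome_id INTEGER NOT NULL,
chromosome_length INTEGER NOT NULL,
start INTEGER NOT NULL CHECK(start > 0),
"end" INTEGER NOT NULL CHECK("end" > start AND "end" <= chromosome_length),
FOREIGN KEY (chromosome_id, chromosome_length) REFERENCES chromosomes(chromosome_id, length)
);
The length is stored redundantly in every gene row, but the composite foreign key ties it to the parent row, and the CHECK can then enforce the range within a single table.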

What data type to use for a tax rate field

What data type should I use for a tax rate field in my RoR application? I want it to store only numbers from 0 to 100 and two fixed strings, "zw" and "np". Should I use a string type and parse it to an integer when it's a number?
I think, perhaps, you should think about the tax rate as the name of a rate rather than a numeric value.
Create a reference table of tax rates, something like:
create table TaxRates (
TaxRateId int primary key, -- auto_increment/serial/identity
Name varchar(255) not null unique,
Value int -- NULL if not appropriate
-- more columns if necessary
);
The drop-down list can use this table for the names. Most of the names will be numbers, but that is ok. The actual numeric value will be in the Value field (which you might really want to be a decimal).
Any table that uses a tax rate would have a foreign key reference to TaxRateId.
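To illustrate (the rows below are made-up examples, not from the question): the two fixed strings become rate names with a NULL Value, while numeric rates keep both:
insert into TaxRates (TaxRateId, Name, Value) values
(1, '23', 23),
(2, '8', 8),
(3, 'zw', NULL),
(4, 'np', NULL);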