Database design for a template based evaluation system - sql

We are working on a database to store some evaluations we conduct. There are a few different types of evaluations and some have changed over time. Because of this we need to keep a record of exactly what an evaluation looked like when it was undertaken.
I figured that the best way to support this would be through a template style system.
With:
A table saving all possible options;
A table mapping options to a template;
An evaluations table mapping a participant to a template on a date/time; and
A table mapping evaluator comments to an option of an evaluation.
This is a skeleton for the design:
CREATE TABLE options (
id SERIAL PRIMARY KEY,
option TEXT NOT NULL
);
CREATE TABLE templates (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE template_options (
template INTEGER NOT NULL REFERENCES templates( id ),
option INTEGER NOT NULL REFERENCES options( id ),
UNIQUE ( template, option )
);
CREATE TABLE participants (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE evaluations (
id SERIAL PRIMARY KEY,
template INTEGER NOT NULL REFERENCES templates( id ),
participant INTEGER NOT NULL REFERENCES participants( id ),
date TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE TABLE evaluation_data (
evaluation INTEGER NOT NULL REFERENCES evaluations( id ),
option INTEGER NOT NULL REFERENCES options( id ),
evaluator_comments TEXT NOT NULL
);
The design is able to capture our data but doesn't restrict the options saved in evaluation_data to the subset specified in the evaluation's template's option mapping. We could probably enforce it with a trigger (we can definitely do it with application logic [we are doing so at the moment]) but are we going down the wrong path with this design?
Can anybody think of a better way to do it?
Edit:
Added an example of a potential trigger we would need to use to ensure valid options are enforced with this design.
CREATE FUNCTION valid_option() RETURNS trigger as $valid_option$
BEGIN
IF NOT NEW.option IN ( SELECT template_options.option
FROM template_options
INNER JOIN templates
ON template_options.template = templates.id
WHERE templates.id = ( SELECT evaluations.template
FROM evaluations
WHERE evaluations.id = NEW.evaluation ) ) THEN
RAISE EXCEPTION 'This option is not mapped for this evaluations template.';
END IF;
RETURN NEW;
END
$valid_option$ LANGUAGE plpgsql;
CREATE TRIGGER valid_option BEFORE INSERT ON evaluation_data FOR EACH ROW EXECUTE PROCEDURE valid_option();

Remember that you need two sets of tables. The first set containing the assessment, questions, answer alternatives, categories(?) needed to display the assessment to the participant. The second set of tables to record data about the evaluation (ie. the participant taking the assessment): which assessment, which questions, which answer alternatives and in which order they were presented, which answer they entered (are they allowed to answer the same question multiple times?), etc.
We're using the following structure (I've removed topic scoring since you didn't ask about it):
Models for presenting an assessment:
Assessment: assessment_name, passing_status, version
Question: assessment, question_number, question_type, question_text
AnswerAlternative: question, correct?, answer_text, points
Models for recording an evaluation (participant taking an assessment):
Progress: started_timestamp, finished_timestamp, last_activity, status (includes "finished")
Result: user, assessment, progress, currently_active, score, passing_grade?
Answer: result, question, selected_answer_alternative, answer_text, score
To achieve your goal, I would augment this by writing the generated evaluation to a table and pointing to it from Result. You could also record the selection and presentation criteria so you can regenerate the assessment programmatically (i.e. if you're selecting the questions from a larger question db and reordering the answer alternatives before presenting them to the participant).
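Coming back to the schema in the question: the rule "an option saved for an evaluation must belong to that evaluation's template" could also be enforced declaratively, without a trigger. The following is only a sketch, assuming PostgreSQL. The extra template column in evaluation_data, the UNIQUE ( id, template ) constraint on evaluations and the primary key on evaluation_data are additions that the original design does not have.

```sql
CREATE TABLE options (
    id     SERIAL PRIMARY KEY,
    option TEXT NOT NULL
);
CREATE TABLE templates (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE template_options (
    template INTEGER NOT NULL REFERENCES templates( id ),
    option   INTEGER NOT NULL REFERENCES options( id ),
    UNIQUE ( template, option )
);
CREATE TABLE participants (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE evaluations (
    id          SERIAL PRIMARY KEY,
    template    INTEGER NOT NULL REFERENCES templates( id ),
    participant INTEGER NOT NULL REFERENCES participants( id ),
    date        TIMESTAMP WITH TIME ZONE NOT NULL,
    -- Assumed addition: gives the composite foreign key below a target.
    UNIQUE ( id, template )
);
CREATE TABLE evaluation_data (
    evaluation         INTEGER NOT NULL,
    -- Assumed addition: repeats the evaluation's template so both keys can use it.
    template           INTEGER NOT NULL,
    option             INTEGER NOT NULL,
    evaluator_comments TEXT NOT NULL,
    PRIMARY KEY ( evaluation, option ),
    -- The stored template must be the one the evaluation was taken with.
    FOREIGN KEY ( evaluation, template ) REFERENCES evaluations( id, template ),
    -- The option must be mapped to that same template.
    FOREIGN KEY ( template, option ) REFERENCES template_options( template, option )
);
```

The cost is one redundant column per row. In exchange the check also covers updates, which the BEFORE INSERT trigger in the question does not.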

Related

How to create a trigger that automatically generates a primary key from multiple fields

I have a geospatial db with (a.o.) a table with locations, and a table with features. The primary key for the locations table is location_id. Location_id is also a foreign key in the features table. The features table also includes the fields "type" (in which a two-letter code is entered to denote particular types of features), and N (which differentiates the different features that may be linked to one location). I figured a combination of location_id, type, and N would make a decent primary key for the features table. Previously, I entered these ids manually. However, I would like for this to be automatically done when a "user" enters a location ID, N, and type. (Ideally I want to find a way to automatically generate the correct N, so that "users" need only enter location_id and type, but I think this should be posted as a separate question?).
I have been trying to achieve this via triggers (see code below), but when I test it by trying to add a new data row to my features table, I get the error message "duplicate key value violates unique constraint features_pkey". Could someone point me in the direction of help for this issue?
CREATE OR REPLACE FUNCTION set_features_id()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS
$$
DECLARE
compos_id text;
BEGIN
SELECT loc_id || type || N FROM features INTO compos_id;
NEW.id := compos_id;
RETURN NEW;
END;
$$;
DROP TRIGGER IF EXISTS set_lf_id_trigger on public.landscape_features_point;
CREATE TRIGGER set_features_id_trigger
BEFORE INSERT
ON "features"
FOR EACH ROW
EXECUTE PROCEDURE set_features_id();
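A likely cause of the error: the SELECT in the trigger function reads loc_id, type and N from a row already stored in features instead of from the row being inserted, so every new row is given the id of an existing one. A minimal sketch of a version that builds the id from NEW follows. The table definition and column types are assumptions made only to keep the example self-contained.

```sql
-- Hypothetical minimal table; the real column types may differ.
CREATE TABLE features (
    id     text PRIMARY KEY,
    loc_id integer NOT NULL,
    type   char(2) NOT NULL,
    n      integer NOT NULL
);

CREATE OR REPLACE FUNCTION set_features_id()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    -- Build the key from the incoming row, not from rows already in the table.
    NEW.id := concat(NEW.loc_id, NEW.type, NEW.n);
    RETURN NEW;
END;
$$;

CREATE TRIGGER set_features_id_trigger
BEFORE INSERT ON features
FOR EACH ROW
EXECUTE PROCEDURE set_features_id();

-- Two features at the same location get distinct ids.
INSERT INTO features (loc_id, type, n) VALUES (12, 'WL', 1);  -- id becomes '12WL1'
INSERT INTO features (loc_id, type, n) VALUES (12, 'WL', 2);  -- id becomes '12WL2'
```

If the concatenated id is only needed for display, a composite primary key on (loc_id, type, n) would avoid storing it at all.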

Is there a way to prevent inserts in specific cases without a unique constraint?

I'm setting up a Postgres table to organize active store commissions in my company, and I need a rule that prevents erroneous inserts.
I've tried creating a table without constraints and enforcing the rules inside a Python script. Although this solution works, it does not prevent me or others from messing things up when updating the table.
create table my_store_commissions (
ID SERIAL PRIMARY KEY ,
STORE_ID INTEGER,
COMMISSION NUMERIC,
IS_ACTIVE BOOLEAN
);
insert into my_store_commissions
values (1,100,0.90, False),
(2,100,0.89, False),
(3,100,0.78, False),
(4,100,0.78, True);
-- This code should not run
insert into my_store_commissions
values (5,100, 0.90, True);
For each store_id, I need to allow any number of rows with is_active = False but only one row with is_active = True.
What you want is a partial [unique] index. That is, a unique index with a filter:
create unique index unq_my_store_commissions_store_id_active
on my_store_commissions(store_id)
where is_active;
Note that these can be tricky to handle when switching which row is active for a store. You may need to deactivate the previously active row before activating the new one.
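One way to do the switch is in two steps inside a single transaction, so that no other session sees the store without an active row. This is a sketch only, assuming PostgreSQL and the table from the question; the sample rows and the commission values used to pick the rows are illustrative.

```sql
CREATE TABLE my_store_commissions (
    id         SERIAL PRIMARY KEY,
    store_id   INTEGER,
    commission NUMERIC,
    is_active  BOOLEAN
);

CREATE UNIQUE INDEX unq_my_store_commissions_store_id_active
    ON my_store_commissions (store_id)
    WHERE is_active;

INSERT INTO my_store_commissions (store_id, commission, is_active)
VALUES (100, 0.90, FALSE),
       (100, 0.78, TRUE);

BEGIN;
-- Step 1: deactivate whichever row is currently active for the store.
UPDATE my_store_commissions
   SET is_active = FALSE
 WHERE store_id = 100
   AND is_active;
-- Step 2: activate the replacement row.
UPDATE my_store_commissions
   SET is_active = TRUE
 WHERE store_id = 100
   AND commission = 0.90;
COMMIT;
```

Doing the steps in the opposite order would fail on the partial unique index, because for a moment two rows of store 100 would be active.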

Making a column unique with one exception

We have an application whose work flow involves submitting information to an outside group and then inputting the user's id number into the system.
For that reason we allow a set default value "00000000" to be put into the id field as a tentative value before the entry is approved and a permanent one is put in.
What I'm looking for is essentially a way to ensure that the column remains unique except for that one value.
What I'm basically looking for is a UNIQUE constraint, but with "00000000" rather than NULL as the exempt placeholder value. I've considered doing it as part of a CHECK constraint, but that seems like it would be a big performance hit (assuming that UNIQUE does some kind of indexing).
Use a filtered index, as follows:
CREATE UNIQUE NONCLUSTERED INDEX idx_yourcolumn_notspecificvalue
ON YourTable(yourcolumn)
WHERE yourcolumn != '00000000';
Example:
-- Create Table
Create table Test (id int identity, code varchar (100))
-- Create Unique Filtered Index
CREATE UNIQUE NONCLUSTERED INDEX idx_MyCol_Filtered
ON Test(code)
WHERE code != '00000000';
-- Insert dummy data >> '00000000' is repeated and '0101' appears once
insert into Test (code)
Values ('00000000'),
('00000000'),
('00000000'),
('0101')
select * from Test
The Result: all four rows are inserted, including the three duplicate '00000000' values.
-- Now try inserting '0101' again
insert into Test (code) Values ('0101')
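The Result: the insert is rejected with a duplicate-key error, because '0101' already exists in the unique filtered index.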
For more details:
Create Filtered Indexes
Approving the user entry through the workflow sounds like crucial business logic. I would suggest generating a random but unique number (a timestamp, for example) and inserting it with the new user entry. Keep an additional flag column that differentiates approved entries from unapproved ones. Once the user gets approval from the workflow, update the id and the flag.

Is it possible to CREATE TABLE with a column that is a combination of other columns in the same table?

I know that the question is very long and I understand if someone doesn't have the time to read it all, but I really wish there is a way to do this.
I am writing a program that will read the database schema from the database catalog tables and automatically build a basic application with the information extracted from the system catalogs.
Many tables in the database can be just a list of items of the form
CREATE TABLE tablename (id INTEGER PRIMARY KEY, description VARCHAR NOT NULL);
so when a table has a column that references the id of tablename I just resolve the descriptions by querying it from the tablename table, and I display a list in a combo box with the available options.
Some tables, however, cannot directly have a description column, because their description would be a combination of other columns. Let's take as an example the most important of those tables in my first application:
CREATE TABLE bankaccount (
bankid INTEGER NOT NULL REFERENCES bank,
officeid INTEGER NOT NULL REFERENCES bankoffice,
crc INTEGER NOT NULL,
number BIGINT NOT NULL
);
This, as many would know, is the full account number for a bank account. In my country it's composed as follows:
[XXXX][XXXX][XX][XXXXXXXXXX]
  ^     ^    ^       ^
  |     |    |       |_ account number
  |     |    |_ crc
  |     |_ bank office id
  |_ bank id
That is why my bankaccount table is structured the way it is.
Now, I would like to have the complete bank account number in a description column so I can display it in the application without giving a special treatment to this situation, since there are some other tables with similar situation, something like
CREATE TABLE bankaccount (
bankid INTEGER NOT NULL REFERENCES bank,
officeid INTEGER NOT NULL REFERENCES bankoffice,
crc INTEGER NOT NULL,
number BIGINT NOT NULL,
description VARCHAR DEFAULT bankid || '-' || officeid || '-' || crc || '-' || number
);
Which of course doesn't work, since the following error is raised [1]:
ERROR: cannot use column references in default expression
If there is any different approach that someone can suggest, please feel free to suggest it as an answer.
[1] This is the error message given by PostgreSQL.
What you want is to create a view on your table. I'm more familiar with MySQL and SQLite, so excuse the differences. But basically, if you have table 'AccountInfo' you can have a view 'AccountInfoView' which is sort of like a 'stored query' but can be used like a table. You would create it with something like
CREATE VIEW AccountInfoView AS
SELECT *, CONCAT(bankid, officeid, crc, number) AS FullAccountNumber
FROM AccountInfo
Another approach is to have an actual FullAccountNumber column in your original table, and create a trigger that sets it any time an insert or update is performed on your table. This is usually less efficient though, as it duplicates storage and takes the performance hit when data are written instead of retrieved. Sometimes that approach can make sense, though.
What actually works, and I believe it's a very elegant solution, is to use a function like this one:
CREATE FUNCTION description(bankaccount) RETURNS VARCHAR AS $$
SELECT
CONCAT(bankid, '-', officeid, '-', crc, '-', number)
FROM
bankaccount this
WHERE
$1.bankid = this.bankid AND
$1.officeid = this.officeid AND
$1.crc = this.crc AND
$1.number = this.number
$$ LANGUAGE SQL STABLE;
which would then be used like this
SELECT bankaccount.description FROM bankaccount;
and hence, my goal is achieved.
Note: this solution works with PostgreSQL only AFAIK.
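On newer PostgreSQL versions, a stored generated column may be another option. The following is a sketch, assuming PostgreSQL 12 or later. The REFERENCES clauses from the question are left out only to keep the example self-contained, and the inserted values are made up. The explicit ::text casts are there because a generation expression must be immutable, and concatenating an integer directly to text with || is not marked immutable.

```sql
CREATE TABLE bankaccount (
    bankid      INTEGER NOT NULL,
    officeid    INTEGER NOT NULL,
    crc         INTEGER NOT NULL,
    number      BIGINT  NOT NULL,
    -- Recomputed by the server whenever the row is inserted or updated.
    description VARCHAR GENERATED ALWAYS AS (
        bankid::text || '-' || officeid::text || '-' ||
        crc::text    || '-' || number::text
    ) STORED
);

INSERT INTO bankaccount (bankid, officeid, crc, number)
VALUES (1234, 5678, 90, 1234567890);

SELECT description FROM bankaccount;  -- 1234-5678-90-1234567890
```

Unlike the function, this stores the text with each row, so it trades a little disk space for a real column that catalog-driven tools can discover.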

What is the best way to self-document "codes" in a SQL based application?

Q: Is there any way to implement self-documenting enumerations in "standard SQL"?
EXAMPLE:
Column: PlayMode
Legal values: 0=Quiet, 1=League Practice, 2=League Play, 3=Open Play, 4=Cross Play
What I've always done is just define the field as "char(1)" or "int", and define the mnemonic ("league practice") as a comment in the code.
Any BETTER suggestions?
I'd definitely prefer using standard SQL, so database type (mySql, MSSQL, Oracle, etc) shouldn't matter. I'd also prefer using any application language (C, C#, Java, etc), so programming language shouldn't matter, either.
Thank you VERY much in advance!
PS:
It's my understanding that using a second table - to map a code to a description, for example "table playmodes (char(1) id, varchar(10) name)" - is very expensive. Is this necessarily correct?
The normal way is to use a static lookup table, sometimes called a "domain table" (because its purpose is to restrict the domain of a column variable.)
It's up to you to keep the underlying values of any enums or the like in sync with the values in the database (you might write a code generator that generates the enum from the domain table and gets invoked when something in the domain table changes).
Here's an example:
--
-- the domain table
--
create table dbo.play_mode
(
id int not null primary key clustered ,
description varchar(32) not null unique nonclustered ,
)
insert dbo.play_mode values ( 0 , 'Quiet' )
insert dbo.play_mode values ( 1 , 'LeaguePractice' )
insert dbo.play_mode values ( 2 , 'LeaguePlay' )
insert dbo.play_mode values ( 3 , 'OpenPlay' )
insert dbo.play_mode values ( 4 , 'CrossPlay' )
--
-- A table referencing the domain table. The column playmode_id is constrained to
-- one of the values contained in the domain table play_mode.
--
create table dbo.game
(
id int not null primary key clustered ,
team1_id int not null foreign key references dbo.team( id ) ,
team2_id int not null foreign key references dbo.team( id ) ,
playmode_id int not null foreign key references dbo.play_mode( id ) ,
)
go
Some people for reasons of "economy" might suggest using a single catch-all table for all such code, but in my experience, that ultimately leads to confusion. Best practice is a single small table for each set of discrete values.
Add a foreign key to a "codes" table.
The primary key of the codes table would be the code value; add a string description column holding the description of each value.
table: PlayModes
Columns: PlayMode number --primary key
Description string
I can't see this being very expensive; databases are built around joining tables like this.
That information should be in the database somewhere, not in comments.
So you should have a table containing those codes, and probably a FK on your table to it.
I agree with @Nicholas Carey (+1): Static data table with two columns, say “Key” or “ID” and “Description”, with foreign key constraints on all tables using the codes. Often the ID columns are simple surrogate keys (1, 2, 3, etc., with no significance attached to the value), but when reasonable I go a step further and use “special” codes. Following are a few examples.
If the values are a sequence (say, Ordered, Paid, Processed, Shipped), I might use 1, 2, 3, 4 to indicate sequence. This can make things easier if you want to find everything “up through” a given stage, such as all orders that have not yet been shipped (ID < 4). If you are into planning ahead, make them 10, 20, 30, 40; this will allow you to add values “in between” existing values, if/when new codes or statuses come along. (Yes, you cannot and should not try to anticipate everything and anything that might have to be done some day, but a bit of pre-planning like this can make some changes that much simpler.)
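As a sketch of the spaced-out variant, in plain SQL: the table and column names (order_status, orders, status_id) are made up for illustration.

```sql
-- Lookup table whose ids encode the order of the stages, with gaps for later additions.
CREATE TABLE order_status (
    id          INTEGER     NOT NULL PRIMARY KEY,
    description VARCHAR(32) NOT NULL UNIQUE
);

INSERT INTO order_status (id, description) VALUES (10, 'Ordered');
INSERT INTO order_status (id, description) VALUES (20, 'Paid');
INSERT INTO order_status (id, description) VALUES (30, 'Processed');
INSERT INTO order_status (id, description) VALUES (40, 'Shipped');

CREATE TABLE orders (
    id        INTEGER NOT NULL PRIMARY KEY,
    status_id INTEGER NOT NULL REFERENCES order_status (id)
);

-- All orders that have not yet been shipped.
SELECT id FROM orders WHERE status_id < 40;
```

A stage added later between Paid and Processed could take id 25 without renumbering any existing rows.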
Keys/IDs are often integers (1 byte, 2 byte, 4 byte, whatever). There’s little cost to making them character values (1 char, 2 char, 3 char, 4 char). That’s fixed-length character, not variable character. Done this way, you can have mnemonics on your codes, such as
O, P, R, S
Or, Pd, Pr, Sh
Ordr, Paid, Proc, Ship
…or whatever floats your boat. Done this way, I have found that it can save a lot of time when analyzing or debugging. You still want the lookup table, for relational integrity as well as a reminder for the more obscure codes.
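A sketch of the mnemonic variant, again in plain SQL with made-up names (order_status_code, shipment, status_code):

```sql
-- Lookup table keyed by a fixed-length mnemonic code.
CREATE TABLE order_status_code (
    code        CHAR(4)     NOT NULL PRIMARY KEY,
    description VARCHAR(32) NOT NULL UNIQUE
);

INSERT INTO order_status_code (code, description) VALUES ('Ordr', 'Ordered');
INSERT INTO order_status_code (code, description) VALUES ('Paid', 'Paid');
INSERT INTO order_status_code (code, description) VALUES ('Proc', 'Processed');
INSERT INTO order_status_code (code, description) VALUES ('Ship', 'Shipped');

CREATE TABLE shipment (
    id          INTEGER NOT NULL PRIMARY KEY,
    status_code CHAR(4) NOT NULL REFERENCES order_status_code (code)
);

-- The stored value is readable without a join.
SELECT id FROM shipment WHERE status_code = 'Ship';
```

The foreign key still rejects codes that are not in the lookup table, so the mnemonics do not weaken integrity.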