How to replace compound primary key in table? - sql

I have the following table:
CREATE TABLE "PostViews"
(
"createdAt" timestamp with time zone NOT NULL,
"updatedAt" timestamp with time zone NOT NULL,
"PostId" integer NOT NULL,
"UserId" integer NOT NULL,
CONSTRAINT "PostViews_pkey" PRIMARY KEY ("PostId", "UserId"),
CONSTRAINT "PostViews_PostId_fkey" FOREIGN KEY ("PostId")
REFERENCES "Posts" (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT "PostViews_UserId_fkey" FOREIGN KEY ("UserId")
REFERENCES "Users" (id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
)
It is a join-through table that links many Users to many Posts, with the compound PK (PostId, UserId). I need to drop the uniqueness of (PostId, UserId) so that more than one PostView can be stored per post per user.
Should I remove the PK and add an index on (PostId, UserId) (or add the index first and then drop the PK)?
Or should I add an id serial column, make it the PK, and then drop the compound PK?

You do not explain the purpose of the updatedAt column, so I would drop it, rename createdAt to viewedAt, and make it part of the PK:
constraint PostViews_pkey PRIMARY KEY (PostId, UserId, viewedAt)
The chance of the same user viewing the same post twice at exactly the same timestamp is insignificant. If you want to handle that case, wrap the client insertion code in a try/catch and retry on a PK violation.
Do not use double quotes for identifiers. Quoted identifiers are case-sensitive, so every later query has to quote them with exactly the same capitalization.
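A minimal sketch of that change as a PostgreSQL migration, assuming the table from the question and that no other table has a foreign key pointing at the old primary key. The existing quoted names are kept here only so the statements run against the current schema:

```sql
BEGIN;

-- Remove the two-column primary key that forbids repeated views.
ALTER TABLE "PostViews" DROP CONSTRAINT "PostViews_pkey";

-- Assumption: nothing else reads "updatedAt", so it can be dropped.
ALTER TABLE "PostViews" DROP COLUMN "updatedAt";
ALTER TABLE "PostViews" RENAME COLUMN "createdAt" TO "viewedAt";

-- The view timestamp becomes the third key column.
ALTER TABLE "PostViews"
    ADD CONSTRAINT "PostViews_pkey"
    PRIMARY KEY ("PostId", "UserId", "viewedAt");

COMMIT;
```

The index behind the new primary key still starts with "PostId" and "UserId", so lookups by that pair can keep using it and a separate index on the two columns should not be needed.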

Related

Efficiently enforcing a 1:1 relationship between two rows with foreign key constraints without creating redundant unique indexes

I have two PostgreSQL tables designed in the following way:
create type content_owner as enum (
'document',
'task'
);
create table content (
id serial not null primary key,
owner content_owner not null,
owner_document_id int references document(id) deferrable initially deferred,
owner_task_id int references task(id) deferrable initially deferred,
-- ...
constraint collab_content_owner_document
check (owner_document_id is null or (owner = 'document' and owner_document_id is not null)),
constraint collab_content_owner_task
check (owner_task_id is null or (owner = 'task' and owner_task_id is not null))
);
create table document (
id serial not null primary key,
content_id int not null references content(id),
-- ...
);
create table task (
id serial not null primary key,
content_id int not null references content(id),
-- ...
);
I want to enforce a 1:1 relationship at the database level for the document<->content relationship and the task<->content relationship.
Adding the following constraints accomplishes that:
alter table content add foreign key (owner_document_id, id) references document (id, content_id) deferrable initially deferred;
alter table content add foreign key (owner_task_id, id) references task (id, content_id) deferrable initially deferred;
alter table document add foreign key (content_id, id) references content (id, owner_document_id);
alter table task add foreign key (content_id, id) references content (id, owner_task_id);
This works because the ID pair in each table must reference the same ID pair in the other table, in both directions. However, it also requires me to create the following unique constraints:
alter table document add unique (id, content_id);
alter table task add unique (id, content_id);
alter table content add unique (id, owner_document_id);
alter table content add unique (id, owner_task_id);
These indexes feel pretty redundant given that there’s already a primary key on the id columns for these tables. It feels like PostgreSQL should be smart enough to be able to use the existing primary key constraint to make sure the foreign key constraints are met. Ideally I wouldn’t create a second, redundant, index on these tables for the purpose of these foreign key constraints.
Is there a way for me to avoid creating new unique indexes and instead tell PostgreSQL to only lookup the unique ID when resolving the foreign key?
Will PostgreSQL detect that these unique indexes are redundant (because the first column is the primary key) and not materialize a new index on disk for their purpose?
Is there a better way to enforce this constraint?
Two-way linking like this is a recipe for headaches. I recommend avoiding reference cycles if you can. In your case, the simplest way to store this information is to relax the constraint that there cannot be a content without a document or a task. Ask yourself, how might such a situation occur, how else could it be avoided, and what damage might it cause if it happens?
If we can remove that constraint, then we can have a very simple structure where document and task each have a content_id foreign key, and a unique index on it to ensure that no two documents have the same content.
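A minimal sketch of that simpler structure, assuming the owner columns and the enum are dropped from content entirely (the elided columns from the question are left out):

```sql
CREATE TABLE content (
    id serial PRIMARY KEY
);

CREATE TABLE document (
    id         serial PRIMARY KEY,
    -- UNIQUE keeps two documents from pointing at the same content row.
    content_id int NOT NULL UNIQUE REFERENCES content (id)
);

CREATE TABLE task (
    id         serial PRIMARY KEY,
    -- UNIQUE keeps two tasks from pointing at the same content row.
    content_id int NOT NULL UNIQUE REFERENCES content (id)
);
```

Note that this sketch allows a content row with no owner, and it does not stop one document and one task from sharing the same content row; whether that matters depends on the application.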
If we can't remove that constraint, then the answers to your questions are:
There is no way to avoid creating those new unique indexes. The columns a foreign key references must be covered by a primary key or unique constraint on exactly those columns.
Postgres will not detect that these indexes are redundant, and they will indeed be materialized and take up space.

SQL insert while checking that one of keys is equal to a field in another table

I'm using sequelize and postgresql but I think this is a more generic SQL/table question.
I have a setup similar to:
CREATE TABLE "Mixtime" (
id bigint NOT NULL,
duration character varying(255) NOT NULL,
created_at timestamp with time zone NOT NULL,
updated_at timestamp with time zone NOT NULL,
spell_id bigint NOT NULL,
ingredient_id bigint NOT NULL,
user_id bigint NOT NULL
);
CREATE TABLE "Spell" (
id bigint NOT NULL,
instructions character varying(5000) NOT NULL,
created_at timestamp with time zone NOT NULL,
user_id bigint NOT NULL
);
CREATE TABLE "Ingredient" (
id bigint NOT NULL,
ing_name character varying(255) NOT NULL,
created_at timestamp with time zone NOT NULL,
updated_at timestamp with time zone NOT NULL,
user_id bigint NOT NULL
);
CREATE TABLE "Users" (
id bigint NOT NULL,
user_name character varying(255) NOT NULL,
created_at timestamp with time zone NOT NULL,
updated_at timestamp with time zone NOT NULL,
);
ALTER TABLE ONLY "Mixtime"
ADD CONSTRAINT "Mixtime_pkey" PRIMARY KEY (id);
ALTER TABLE ONLY "Spell"
ADD CONSTRAINT "Spell_pkey" PRIMARY KEY (id);
ALTER TABLE ONLY "Ingredient"
ADD CONSTRAINT "Ingredient_pkey" PRIMARY KEY (id);
ALTER TABLE ONLY "Users"
ADD CONSTRAINT "Users_pkey" PRIMARY KEY (id);
ALTER TABLE ONLY "Mixtime"
ADD CONSTRAINT "Mixtime_spell_id_fkey" FOREIGN KEY (spell_id) REFERENCES "Spell"(id) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE ONLY "Mixtime"
ADD CONSTRAINT "Mixtime_ingredient_id_fkey" FOREIGN KEY (ingredient_id) REFERENCES "Ingredient"(id) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE ONLY "Mixtime"
ADD CONSTRAINT "Mixtime_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "Users"(id) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE ONLY "Spell"
ADD CONSTRAINT "Spell_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "Users"(id) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE ONLY "Ingredient"
ADD CONSTRAINT "Ingredient_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "Users"(id) ON UPDATE CASCADE ON DELETE CASCADE;
What I'd like to do is make sure when I insert into Mixtime that user_id matches Spell & Ingredients' user_id fields
Pseudo code:
If (newMixtime.userId != Spell.user_id || newMixtime.userId != Ingredient.user_id) {
failHere
} else {
insert newMixtime into M
}
Note that this is not a join table. All three tables need to be query-able by the user_id field and table Mixtime has specific extra fields, it's just referencing Spell & Ingredient.
I could validate this at the ORM layer by querying the db first (and currently do), but this seems like something that should be possible in the DB layer and would save me round trips.
If you know how to map this into Sequelize's Model syntax, that'd be grand, but I can probably figure that out if I have a pure postgres/SQL solution.
A fully normalized set of tables wouldn't have this constraint. The simplest way to represent this information is for every row in every table to be "owned" by exactly one row of exactly one parent table, as represented by a single foreign key. Unless you have a pressing need to share ingredients among different steps/spells of one user but NOT share them among different users, you don't need a user_id field on the ingredient.
Having true multi-table constraints (other than foreign keys) is extremely hard to do accurately and reliably. You can use triggers, but they tend to be very engine-specific, and I don't recommend putting your logic into triggers unless there's really no alternative.
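If the user_id columns have to stay, one option that relies only on foreign keys (not something proposed in the answer above) is a composite foreign key. This is a sketch against the tables from the question; the constraint names are made up for illustration:

```sql
-- A composite foreign key needs a unique constraint on the referenced pair.
ALTER TABLE "Spell"
    ADD CONSTRAINT "Spell_id_user_id_key" UNIQUE (id, user_id);
ALTER TABLE "Ingredient"
    ADD CONSTRAINT "Ingredient_id_user_id_key" UNIQUE (id, user_id);

-- A Mixtime row must now point at a Spell owned by the same user ...
ALTER TABLE "Mixtime"
    ADD CONSTRAINT "Mixtime_spell_owner_fkey"
    FOREIGN KEY (spell_id, user_id) REFERENCES "Spell" (id, user_id)
    ON UPDATE CASCADE ON DELETE CASCADE;

-- ... and at an Ingredient owned by the same user.
ALTER TABLE "Mixtime"
    ADD CONSTRAINT "Mixtime_ingredient_owner_fkey"
    FOREIGN KEY (ingredient_id, user_id) REFERENCES "Ingredient" (id, user_id)
    ON UPDATE CASCADE ON DELETE CASCADE;
```

With these in place, an insert into "Mixtime" whose user_id differs from the owner of the referenced spell or ingredient is rejected with a foreign key violation, so the ORM does not need a separate lookup. The cost is two extra unique indexes that overlap with the primary keys.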

Unique key with Empty values

This is the schema for the table. It defines a unique constraint on job_category_id, screening_type_id, test_id, and sex.
CREATE TABLE job_profile
(
profile_id numeric(5,0) NOT NULL,
job_category_id numeric(5,0),
test_id numeric(5,0),
sex character(1),
default_yn character(1),
screening_type_id numeric(5,0),
CONSTRAINT job_profile_pkey PRIMARY KEY (profile_id),
CONSTRAINT fk_jobprofile_jobcate FOREIGN KEY (job_category_id)
REFERENCES job_category_mast (job_category_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT fk_jobprofile_test FOREIGN KEY (test_id)
REFERENCES test_mast (test_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT fk_prof_screentype FOREIGN KEY (screening_type_id)
REFERENCES screening_type_mast (screening_type_id) MATCH SIMPLE
ON UPDATE RESTRICT ON DELETE RESTRICT,
CONSTRAINT uk_job_test_sex_screening UNIQUE (job_category_id, screening_type_id, test_id, sex)
)
WITH (
OIDS=FALSE
);
ALTER TABLE job_profile
OWNER TO cdcis;
GRANT ALL ON TABLE job_profile TO cdcis;
GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE job_profile TO cdcis_app;
If one of these fields is empty (NULL), the unique constraint does not prevent duplicates.
How can I add a constraint so that an empty value is accepted but the combination still has to be unique?
Can this scenario be handled in the application using JPA?
According to the documentation:
Null values are not considered equal
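So a unique constraint does not treat two rows as duplicates when one of the compared columns is NULL.
You have 2 options:
Create a trigger that checks data integrity manually and rejects a change if the table would contain more than one row with the same empty value.
Set the default value to 0 and add a NOT NULL constraint on these columns. In that case, only one row can contain the empty (zero) value. For the foreign key columns, a row with key 0 would then also have to exist in the referenced table.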
Update:
As Abelisto suggested, it can be easily done with functional indexes.
CREATE UNIQUE INDEX uix_job_test_sex_screening on job_profile(coalesce(job_category_id, -1), screening_type_id, test_id, sex);
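The index above only covers a NULL job_category_id. A sketch that treats a NULL in any of the four columns as a comparable value follows; the index and constraint names are hypothetical, and the sentinel values are assumptions that must never occur as real data:

```sql
-- Assumption: -1 is never a real id and '?' is never a real sex code.
CREATE UNIQUE INDEX uix_job_profile_nulls_as_values ON job_profile (
    coalesce(job_category_id, -1),
    coalesce(screening_type_id, -1),
    coalesce(test_id, -1),
    coalesce(sex, '?')
);

-- Alternative for PostgreSQL 15 or later, which needs no sentinel values:
-- ALTER TABLE job_profile
--     ADD CONSTRAINT uk_job_profile_nulls_not_distinct
--     UNIQUE NULLS NOT DISTINCT
--     (job_category_id, screening_type_id, test_id, sex);
```

Either form rejects a second row whose four values match an existing row, with NULL counted as a match for NULL.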

PostgreSQL assigns foreign key automatically on the referencing side

So I've got Table ActorInMovies, which has 3 foreign keys.
CREATE TABLE ActorInMovie(
ID_ROLE bigserial REFERENCES Role(ID_ROLE) ON DELETE CASCADE,
ID_ACTOR bigserial REFERENCES Actor(ID_Actor) ON DELETE CASCADE,
ID_MOVIE bigserial REFERENCES Movie(ID_Movie) ON DELETE CASCADE,
CONSTRAINT ActorInMovie_pk PRIMARY KEY (ID_ROLE));
I assumed that when I try to insert something like:
INSERT INTO ActorInMovie (ID_ROLE, ID_ACTOR) values (1,1);
that it would result in an error as ID_MOVIE was not specified (null, I supposed), but it automatically starts assigning indexes starting from 1.
What am I doing wrong? As written here, I thought that "PostgreSQL automatically creates indexes on primary keys and unique constraints, but not on the referencing side of foreign key relationships."
I have a very hard time imagining a use case where a serial (or bigserial) column references another column. It's usually the other way round: the serial column should go on the other end of the foreign key constraint.
I have an equally hard time imagining a design where a movie_id needs to be bigint instead of just int. There aren't nearly enough movies on this planet.
Also, there is a good chance that a column called movie_id in a table called actor_in_movie should be defined as NOT NULL.
In short: I doubt your design flies at all. Maybe something like:
CREATE TABLE actor (actor_id serial PRIMARY KEY, actor text, ...);
CREATE TABLE movie (movie_id serial PRIMARY KEY, movie text, ...);
CREATE TABLE actor_in_movie(
role_id serial PRIMARY KEY
,actor_id int NOT NULL REFERENCES actor(actor_id) ON DELETE CASCADE
,movie_id int NOT NULL REFERENCES movie(movie_id) ON DELETE CASCADE
);
A NOT NULL constraint on role_id would be redundant, because a primary key column is NOT NULL automatically.
You probably want indices on actor_id and on movie_id in actor_in_movie.
More details:
How to implement a many-to-many relationship in PostgreSQL?
This is simply bigserial working exactly as advertised. It has nothing to do with the foreign key constraint, or with an index.
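To make that concrete, here is a sketch of what a bigserial column roughly expands to. The table and sequence names are hypothetical and only serve the illustration:

```sql
-- Shorthand form, as used for ID_MOVIE in the question.
CREATE TABLE serial_demo (id bigserial);

-- Roughly the long form PostgreSQL creates behind the scenes.
CREATE SEQUENCE serial_demo_long_id_seq;
CREATE TABLE serial_demo_long (
    id bigint NOT NULL DEFAULT nextval('serial_demo_long_id_seq')
);
ALTER SEQUENCE serial_demo_long_id_seq OWNED BY serial_demo_long.id;

-- Omitting the column does not store NULL: the column default
-- draws the next value from the sequence instead.
INSERT INTO serial_demo DEFAULT VALUES;
INSERT INTO serial_demo_long DEFAULT VALUES;
```

This is why the INSERT in the question succeeds: ID_MOVIE is filled from its own sequence, and the foreign key only fails if that generated value has no matching row in Movie.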

To prevent the use of duplicate Tags in a database

I would like to know how you can prevent the use of the same tag twice in a database table.
Someone told me to use two private keys in a table. However, the W3Schools website says that this is impossible.
My relational table: http://files.getdropbox.com/u/175564/db/db7.png
My logical table: http://files.getdropbox.com/u/175564/db/db77.png
The context of tables: http://files.getdropbox.com/u/175564/db/db777.png
How can you prevent the use of duplicate tags in a question?
I have updated my NORMA model to more closely match your diagram. I can see where you've made a few mistakes, but some of them may have been due to my earlier model.
I have updated this model to prevent duplicate tags. It didn't really matter before. But since you want it, here it is (for Postgres):
START TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ WRITE;
CREATE SCHEMA so;
SET search_path TO SO,"$user",public;
CREATE DOMAIN so.HashedPassword AS
BIGINT CONSTRAINT HashedPassword_Unsigned_Chk CHECK (VALUE >= 0);
CREATE TABLE so."User"
(
USER_ID SERIAL NOT NULL,
USER_NAME CHARACTER VARYING(50) NOT NULL,
EMAIL_ADDRESS CHARACTER VARYING(256) NOT NULL,
HASHED_PASSWORD so.HashedPassword NOT NULL,
OPEN_ID CHARACTER VARYING(512),
A_MODERATOR BOOLEAN,
LOGGED_IN BOOLEAN,
HAS_BEEN_SENT_A_MODERATOR_MESSAGE BOOLEAN,
CONSTRAINT User_PK PRIMARY KEY(USER_ID)
);
CREATE TABLE so.Question
(
QUESTION_ID SERIAL NOT NULL,
TITLE CHARACTER VARYING(256) NOT NULL,
WAS_SENT_AT_TIME TIMESTAMP NOT NULL,
BODY CHARACTER VARYING NOT NULL,
USER_ID INTEGER NOT NULL,
FLAGGED_FOR_MODERATOR_REMOVAL BOOLEAN,
WAS_LAST_CHECKED_BY_MODERATOR_AT_TIME TIMESTAMP,
CONSTRAINT Question_PK PRIMARY KEY(QUESTION_ID)
);
CREATE TABLE so.Tag
(
TAG_ID SERIAL NOT NULL,
TAG_NAME CHARACTER VARYING(20) NOT NULL,
CONSTRAINT Tag_PK PRIMARY KEY(TAG_ID),
CONSTRAINT Tag_UC UNIQUE(TAG_NAME)
);
CREATE TABLE so.QuestionTaggedTag
(
QUESTION_ID INTEGER NOT NULL,
TAG_ID INTEGER NOT NULL,
CONSTRAINT QuestionTaggedTag_PK PRIMARY KEY(QUESTION_ID, TAG_ID)
);
CREATE TABLE so.Answer
(
ANSWER_ID SERIAL NOT NULL,
BODY CHARACTER VARYING NOT NULL,
USER_ID INTEGER NOT NULL,
QUESTION_ID INTEGER NOT NULL,
CONSTRAINT Answer_PK PRIMARY KEY(ANSWER_ID)
);
ALTER TABLE so.Question
ADD CONSTRAINT Question_FK FOREIGN KEY (USER_ID)
REFERENCES so."User" (USER_ID) ON DELETE RESTRICT ON UPDATE RESTRICT;
ALTER TABLE so.QuestionTaggedTag
ADD CONSTRAINT QuestionTaggedTag_FK1 FOREIGN KEY (QUESTION_ID)
REFERENCES so.Question (QUESTION_ID) ON DELETE RESTRICT ON UPDATE RESTRICT;
ALTER TABLE so.QuestionTaggedTag
ADD CONSTRAINT QuestionTaggedTag_FK2 FOREIGN KEY (TAG_ID)
REFERENCES so.Tag (TAG_ID) ON DELETE RESTRICT ON UPDATE RESTRICT;
ALTER TABLE so.Answer
ADD CONSTRAINT Answer_FK1 FOREIGN KEY (USER_ID)
REFERENCES so."User" (USER_ID) ON DELETE RESTRICT ON UPDATE RESTRICT;
ALTER TABLE so.Answer
ADD CONSTRAINT Answer_FK2 FOREIGN KEY (QUESTION_ID)
REFERENCES so.Question (QUESTION_ID) ON DELETE RESTRICT ON UPDATE RESTRICT;
COMMIT WORK;
Note that there is now a separate Tag table with TAG_ID as the primary key. TAG_NAME is a separate column with a uniqueness constraint over it, preventing duplicate tags. The QuestionTaggedTag table now has (QUESTION_ID, TAG_ID), which is also its primary key.
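A short usage sketch against the schema above, assuming PostgreSQL 9.5 or later for ON CONFLICT and assuming a question with QUESTION_ID 1 already exists (the tag name is just an example):

```sql
-- Create the tag only if no tag with that name exists yet (Tag_UC).
INSERT INTO so.Tag (TAG_NAME)
VALUES ('postgresql')
ON CONFLICT (TAG_NAME) DO NOTHING;

-- Attach the tag to question 1; a repeated attempt is silently
-- skipped because of the primary key on (QUESTION_ID, TAG_ID).
INSERT INTO so.QuestionTaggedTag (QUESTION_ID, TAG_ID)
SELECT 1, TAG_ID
FROM so.Tag
WHERE TAG_NAME = 'postgresql'
ON CONFLICT (QUESTION_ID, TAG_ID) DO NOTHING;
```

Without the ON CONFLICT clauses, the second run of either statement would fail with a unique violation instead, which is the constraint doing its job.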
I hope I didn't go too far in answering this, but when I tried to write smaller answers, I kept having to untangle my earlier answers, and it seemed simpler just to post this.
You can create a unique constraint on (question_id, tag_name) in the tags table, which will ensure that the pair is unique. That would mean that the same question may not have the same tag attached more than once. However, the same tag could still apply to different questions.
You cannot create two primary keys, but you can place a uniqueness constraint on an index.
You can only have one primary key (I assume that's what you mean by "private" key), but that key can be a composite key consisting of the question-id and tag-name. In SQL, it would look like (depending on your SQL dialect):
CREATE TABLE Tags
(
question_id int,
tag_name varchar(xxx),
PRIMARY KEY (question_id, tag_name)
);
This will ensure you cannot have the same tag against the same question.
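A small sketch of how the composite key behaves, using a hypothetical table name and an arbitrary length of 20 in place of the elided one:

```sql
CREATE TABLE tags_demo
(
    question_id int,
    tag_name    varchar(20),
    PRIMARY KEY (question_id, tag_name)
);

INSERT INTO tags_demo VALUES (1, 'sql');  -- accepted
INSERT INTO tags_demo VALUES (2, 'sql');  -- accepted: same tag, other question
INSERT INTO tags_demo VALUES (1, 'sql');  -- rejected: duplicate key
```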
I will use PostgreSQL or Oracle.
I think the following corresponds to Ken's code, which is for MySQL.
CREATE TABLE Tags
(
QUESTION_ID integer REFERENCES Questions (QUESTION_ID)
CHECK (QUESTION_ID > 0),
TAG_NAME varchar(20) NOT NULL,
CONSTRAINT no_duplicate_tag UNIQUE (QUESTION_ID,TAG_NAME)
)
I added some extra measures to the query. For instance, CHECK (QUESTION_ID>0) is there to ensure that there is no corrupted data in the database.
I dropped the AUTO_INCREMENT from QUESTION_ID because I think it would break our system, since one question could then not have two purposely selected tags. In other words, the tags would get mixed up.
I see that we need to give a name for the constraint. Its name is no_duplicate_tag in the command.