Proper way to make a relation between multiple rows of single table - sql

I have the following situation: I want to connect multiple records from one table with some kind of relation. A record could have no connection to any other record, or it could have several (1 or more). There is no hierarchy in this relation.
For example:
CREATE TABLE x
(
x_id SERIAL NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL
);
I've thought of two ideas:
Make a new column in this table that contains some relationId. It won't reference anything. When a new record is inserted, I generate a new relationId and put it there. If I want to connect another record with this one, I simply give it the same relationId.
Example:
CREATE TABLE x
(
x_id NUMBER(19, 0) NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL,
relation_id NUMBER(19, 0) NOT NULL
);
insert into x values (nextval, 'blah', 1);
insert into x values (nextval, 'blah2', 1);
It will connect these two rows.
pros:
very easy
easy queries to get all records connected to particular record
no overhead
cons:
hibernate entity will contain only relationId, no collection of
related records (or maybe it's possible somehow?)
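A minimal sketch of the "easy queries" claim for this first idea, run through Python's sqlite3 module only so that it is self-contained (the SQLite types, the index name ix_x_relation and the helper related_to are assumptions for illustration, not part of the schema above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (
        x_id        INTEGER PRIMARY KEY,
        data        TEXT    NOT NULL,
        relation_id INTEGER NOT NULL
    );
    -- Assumed index so that group lookups do not scan the whole table.
    CREATE INDEX ix_x_relation ON x (relation_id);
    INSERT INTO x (data, relation_id)
    VALUES ('blah', 1), ('blah2', 1), ('alone', 2);
""")

def related_to(conn, x_id):
    """Return ids of all rows sharing a relation_id with x_id, excluding x_id."""
    rows = conn.execute(
        """
        SELECT other.x_id
        FROM x AS me
        JOIN x AS other
          ON other.relation_id = me.relation_id
         AND other.x_id <> me.x_id
        WHERE me.x_id = ?
        ORDER BY other.x_id
        """,
        (x_id,),
    ).fetchall()
    return [r[0] for r in rows]

related_of_first = related_to(conn, 1)  # row 1 shares group 1 with row 2
related_of_third = related_to(conn, 3)  # row 3 is alone in group 2
```

One consequence of this design is that connections are transitive: every row in a group is related to every other row in it, so a row cannot belong to two separate groups.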
Make a separate join table and connect rows with a many-to-many relation. The join table would contain two columns with ids, so one entry would connect two rows.
Example:
CREATE TABLE x
(
x_id SERIAL NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL
);
CREATE TABLE bridge_x
(
x_id1 NUMBER(19, 0) NOT NULL REFERENCES x (x_id),
x_id2 NUMBER(19, 0) NOT NULL REFERENCES x (x_id),
PRIMARY KEY(x_id1, x_id2)
);
insert into x values (1, 'blah');
insert into x values (2, 'blah2');
insert into bridge_x values (1, 2);
insert into bridge_x values (2, 1);
pros:
normalized relation
easy hibernate entity mapping, with collection containing related
records
cons:
overhead (with multiple connected rows, every pair must be inserted)
What is the best way to do this? Is there any other way than these two?

The best way in my experience is to use normalization as you've said in your second option. What you are looking for here is to create a foreign key.
So if you use the example you've given in example 2 and then apply the following SQL statement, you will create a relational database that can have 0 to many relations.
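ALTER TABLE `bridge_x` ADD CONSTRAINT `fk_1` FOREIGN KEY (`x_id1`) REFERENCES `x`(`x_id`) ON DELETE NO ACTION ON UPDATE NO ACTION;
ALTER TABLE `bridge_x` ADD CONSTRAINT `fk_2` FOREIGN KEY (`x_id2`) REFERENCES `x`(`x_id`) ON DELETE NO ACTION ON UPDATE NO ACTION;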
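One possible way to reduce the "every pair must be inserted" overhead mentioned in the question is to store each pair only once, smaller id first, and query both columns. This is a sketch of that variation rather than something the answer above proposes; it uses SQLite through Python so it can be run as-is, and the CHECK constraint and the helpers connect and neighbours are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE x (
        x_id INTEGER PRIMARY KEY,
        data TEXT NOT NULL
    );
    CREATE TABLE bridge_x (
        x_id1 INTEGER NOT NULL REFERENCES x (x_id),
        x_id2 INTEGER NOT NULL REFERENCES x (x_id),
        PRIMARY KEY (x_id1, x_id2),
        CHECK (x_id1 < x_id2)  -- assumed rule: one row per unordered pair
    );
    INSERT INTO x VALUES (1, 'blah'), (2, 'blah2'), (3, 'blah3');
""")

def connect(conn, a, b):
    """Store the pair once, always with the smaller id in x_id1."""
    lo, hi = sorted((a, b))
    conn.execute("INSERT INTO bridge_x (x_id1, x_id2) VALUES (?, ?)", (lo, hi))

def neighbours(conn, x_id):
    """Look the id up on both sides of the bridge table."""
    rows = conn.execute(
        """
        SELECT x_id2 FROM bridge_x WHERE x_id1 = ?
        UNION
        SELECT x_id1 FROM bridge_x WHERE x_id2 = ?
        ORDER BY 1
        """,
        (x_id, x_id),
    ).fetchall()
    return [r[0] for r in rows]

connect(conn, 2, 1)
connect(conn, 1, 3)
n1 = neighbours(conn, 1)
n2 = neighbours(conn, 2)

# The foreign keys reject a link to a row that does not exist.
try:
    connect(conn, 1, 99)
    dangling_link_allowed = True
except sqlite3.IntegrityError:
    dangling_link_allowed = False
```

The trade-off is that every lookup has to check both columns, which may be less convenient for an ORM mapping than the symmetric two-row form shown in the question.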

Related

How to insert values into a junction/linking table in SQL Server?

I am piggybacking off this question regarding creating a junction/linking table. It is clear how to create a junction table, but I am concerned about how to fill it with data. What is the simplest and/or best method for filling the junction table (movie_writer_junction) with data from the two other tables (movie, writer)?
CREATE TABLE movie
(
movie_id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
movie_name NVARCHAR(100),
title_date DATE
);
CREATE TABLE writer
(
writer_id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
writer_name NVARCHAR(100),
birth_date DATE
);
INSERT INTO movie
VALUES ('Batman', '2015-12-12'), ('Robin', '2016-12-12'),
('Charzard, the movie', '2018-12-12')
INSERT INTO writer
VALUES ('Christopher', '1978-12-12'), ('Craig', '1989-12-12'),
('Ash', '1934-12-12')
CREATE TABLE movie_writer_junction
(
movie_id INT,
writer_id INT,
CONSTRAINT movie_writer_pk
PRIMARY KEY(movie_id, writer_id),
CONSTRAINT movie_id_fk
FOREIGN KEY(movie_id) REFERENCES movie(movie_id),
CONSTRAINT writer_fk
FOREIGN KEY(writer_id) REFERENCES writer(writer_id)
);
The final junction table is currently empty. This is a simple example, and you can manually fill the data into the junction table, but if I have two tables with millions of rows, how is something like this completed?
Hi I'm guessing this relates to the fact that you can't rely on the Identity Columns being the same in different regions.
You can write your inserts as a cross join from the 2 source tables:
INSERT INTO movie_writer_junction (writer_id, movie_id)
SELECT writer_id, movie_id
FROM writer
CROSS JOIN movie
WHERE writer_name = 'Tolkien' AND movie_name = 'Lord of the Ring';
This way you always get the correct Surrogate Key (the identity) from both tables.
It's pretty easy to generate a SQL statement for all your existing junction combinations using a bit of dynamic SQL.
Another approach is to use SET IDENTITY_INSERT <table> ON, but this needs to be done when loading the 2 other tables, and that ship may already have sailed!
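For millions of rows, a set-based alternative to one statement per pair is to load the pairings by name into a staging table and resolve both surrogate keys in a single INSERT ... SELECT. The sketch below runs on SQLite through Python for convenience; the credit_staging table and the credits in it are made up for illustration, and it assumes the names are unique enough to join on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie  (movie_id  INTEGER PRIMARY KEY, movie_name  TEXT, title_date TEXT);
    CREATE TABLE writer (writer_id INTEGER PRIMARY KEY, writer_name TEXT, birth_date TEXT);
    CREATE TABLE movie_writer_junction (
        movie_id  INTEGER REFERENCES movie (movie_id),
        writer_id INTEGER REFERENCES writer (writer_id),
        PRIMARY KEY (movie_id, writer_id)
    );
    -- Hypothetical staging table: one row per (movie, writer) credit, by name.
    CREATE TABLE credit_staging (movie_name TEXT, writer_name TEXT);

    INSERT INTO movie (movie_name, title_date) VALUES
        ('Batman', '2015-12-12'), ('Robin', '2016-12-12'),
        ('Charzard, the movie', '2018-12-12');
    INSERT INTO writer (writer_name, birth_date) VALUES
        ('Christopher', '1978-12-12'), ('Craig', '1989-12-12'),
        ('Ash', '1934-12-12');
    -- Invented pairings, only to have something to load.
    INSERT INTO credit_staging VALUES
        ('Batman', 'Christopher'), ('Batman', 'Craig'),
        ('Charzard, the movie', 'Ash');
""")

# Resolve both identity values by joining on the natural keys, in one statement.
conn.execute("""
    INSERT INTO movie_writer_junction (movie_id, writer_id)
    SELECT DISTINCT m.movie_id, w.writer_id
    FROM credit_staging AS s
    JOIN movie  AS m ON m.movie_name  = s.movie_name
    JOIN writer AS w ON w.writer_name = s.writer_name
""")

pairs = conn.execute(
    "SELECT movie_id, writer_id FROM movie_writer_junction ORDER BY 1, 2"
).fetchall()
```

The pairings themselves still have to come from somewhere (a source file, a legacy table); the join only translates names into the generated ids.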

Candidate for Check Constraint

I have a table that holds Tasks for a particular person.
TaskID INT PK
PersonID INT (FK to Person Table)
TaskStatusID INT (FK To list of Statuses)
Deleted DATETIME NULL
The business rule is that a person cannot have more than one active task at a time. A task is 'Active' based on its TaskStatusID. The statuses are:
'5=New, 6=In 7=Progress, 8=Under 9=Review, 10=Complete, 11=Cancelled'
These are values in my Status table.
So, 5, 6, 7, 8 and 9 are Active tasks. The rest are finalised.
A person can only have one task which is in an active state.
So, to test if I can add a task for this person, I would do:
SELECT CASE WHEN EXISTS(SELECT * FROM Task WHERE PersonID = 123 AND TaskStatusID IN (5,6,7,8,9)) THEN 0 ELSE 1 END AS CanAdd
The table has a lot of rows. Around 200,000.
I was thinking of adding a check constraint on this table so that, on update/insert, that query runs to see whether the row being added or edited would break the business rule.
Is a check constraint suitable for this, or is there a more efficient way to maintain data integrity?
Something like:
ADD CONSTRAINT chk_task CHECK (
EXISTS(SELECT * FROM Task WHERE PersonID = ?? AND TaskStatusID IN (5,6,7,8,9)))
You can't easily do it with a check constraint because they only (naturally) can make assertions about columns within the same row. There are some kludgy ways to get around that by using a UDF to query other rows but most implementations I've seen have odd edge cases where it's possible to work around the UDF and end up with invalid rows after all.
What you can do is to create an indexed view that maintains the constraint:
create table dbo.Tasks (
TaskID INT not null primary key,
PersonID INT not null,
TaskStatusID INT not null,
Deleted DATETIME NULL
)
go
create view dbo.DRI_Tasks_OneActivePerPerson
with schemabinding
as
select PersonID from dbo.Tasks
where TaskStatusID IN (5,6,7,8,9)
go
create unique clustered index UX_DRI_Tasks_OneActivePerPerson
on dbo.DRI_Tasks_OneActivePerPerson (PersonID)
And now this insert succeeds (because there's only one row with an active status for person 1):
insert into dbo.Tasks (TaskID,PersonID,TaskStatusID)
values (1,1,5),(2,1,1),(3,1,4)
But this insert fails:
insert into dbo.Tasks (TaskID,PersonID,TaskStatusID)
values (4,2,6),(5,2,8)
With the message:
Cannot insert duplicate key row in object 'dbo.DRI_Tasks_OneActivePerPerson'
with unique index 'UX_DRI_Tasks_OneActivePerPerson'.
The duplicate key value is (2).
If you are using SQL Server 2008 or later version, you could create a unique filtered index:
CREATE UNIQUE INDEX UQ_ActiveStatus
ON dbo.Task (PersonID)
WHERE TaskStatusID IN (5, 6, 7, 8, 9);
It would act as a unique constraint specifically for rows with the specified statuses. You would only be able to have one of the specified statuses per person.
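The same idea can be tried out quickly in SQLite, which calls this a partial index. This is only a sketch to show the behaviour, not SQL Server syntax; the lower-case table and column names and the helper try_insert are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id        INTEGER PRIMARY KEY,
        person_id      INTEGER NOT NULL,
        task_status_id INTEGER NOT NULL
    );
    -- Uniqueness is enforced only over rows whose status is active.
    CREATE UNIQUE INDEX uq_active_status
        ON task (person_id)
        WHERE task_status_id IN (5, 6, 7, 8, 9);
""")

def try_insert(task_id, person_id, status):
    """Return True if the row was accepted, False if the index rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO task VALUES (?, ?, ?)",
                (task_id, person_id, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(1, 1, 5),   # first active task for person 1
    try_insert(2, 1, 10),  # completed task, outside the filtered index
    try_insert(3, 1, 8),   # second active task for person 1
    try_insert(4, 2, 6),   # person 2 is independent of person 1
]
```

Here only the third insert is rejected: finished tasks are not in the index at all, so a person can accumulate any number of them.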
You can use the check constraint above, but the approach I would suggest is to write a DML trigger (before insert/before update) that raises an error when the rule would be violated.
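A rough sketch of the trigger approach, written for SQLite (via Python) because it supports BEFORE triggers directly; SQL Server offers AFTER and INSTEAD OF triggers instead, so the syntax would differ there. The trigger names, the error message and the helper run are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id        INTEGER PRIMARY KEY,
        person_id      INTEGER NOT NULL,
        task_status_id INTEGER NOT NULL
    );

    CREATE TRIGGER trg_task_one_active_ins
    BEFORE INSERT ON task
    BEGIN
        SELECT RAISE(ABORT, 'person already has an active task')
        WHERE NEW.task_status_id IN (5, 6, 7, 8, 9)
          AND EXISTS (SELECT 1 FROM task
                      WHERE person_id = NEW.person_id
                        AND task_status_id IN (5, 6, 7, 8, 9));
    END;

    CREATE TRIGGER trg_task_one_active_upd
    BEFORE UPDATE ON task
    BEGIN
        SELECT RAISE(ABORT, 'person already has an active task')
        WHERE NEW.task_status_id IN (5, 6, 7, 8, 9)
          AND EXISTS (SELECT 1 FROM task
                      WHERE person_id = NEW.person_id
                        AND task_id <> OLD.task_id  -- ignore the row being edited
                        AND task_status_id IN (5, 6, 7, 8, 9));
    END;
""")

def run(sql):
    """Return True if the statement succeeded, False if a trigger aborted it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False

ok_first   = run("INSERT INTO task VALUES (1, 1, 5)")
ok_done    = run("INSERT INTO task VALUES (2, 1, 10)")
ok_second  = run("INSERT INTO task VALUES (3, 1, 6)")
ok_reopen  = run("UPDATE task SET task_status_id = 7 WHERE task_id = 2")
ok_advance = run("UPDATE task SET task_status_id = 6 WHERE task_id = 1")
```

Compared with a unique index, a trigger is more code to maintain and needs separate handling for inserts and updates, so it is probably worth reaching for only when the rule cannot be expressed as an index.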

Two Primary keys that are the same, but in two different tables

I just have a quick question for ya's about primary keys in SQL. I have a primary key in one table (Patient) and another table (Facility) with a different primary key. What I want to do is connect them so I have my primary key from Patient and have that exact primary key (with data) in my Facility table. How do I go about doing this? Thanks for any help in advance, it is greatly appreciated!
Add another table (e.g. hospitalization) that contains both keys:
create table hospitalization (
patient_id int not null,
facility_id int not null,
date_start date not null,
date_end date
);
This is a standard many-to-many relation with properties. It means that a patient can be hospitalized many times and each facility can have many patients.
This is an interesting kind of relation. But you can do it inserting the same id to both tables:
INSERT INTO Patient(ID, NAME) VALUES (5, 'Mike');
INSERT INTO Facility(ID, LOCATION) VALUES (5, 'San Francisco');
You can also use a sequence for the first insert and then use the newly generated id (the sequence's current value) for the second insert.
Note: I do not recommend this practice of ID synchronization. The better way is to let the database assign unique IDs (using a sequence or auto-increment) and then define a foreign key constraint, adding FACILITY_ID to the Patient table or PATIENT_ID to the Facility table, to implement a one-to-one relationship.
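A small sketch of the recommended variant, with PATIENT_ID added to the Facility table. It runs on SQLite through Python; the UNIQUE constraint on the foreign key column is an addition here (a plain foreign key alone would allow many facilities per patient), and the second and third inserts use invented values purely to show what gets rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,  -- assigned by the database
        name       TEXT NOT NULL
    );
    CREATE TABLE facility (
        facility_id INTEGER PRIMARY KEY,
        location    TEXT NOT NULL,
        -- UNIQUE turns the many-to-one foreign key into one-to-one.
        patient_id  INTEGER NOT NULL UNIQUE REFERENCES patient (patient_id)
    );
""")

# Let the database pick the patient id, then reuse it for the facility row.
cur = conn.execute("INSERT INTO patient (name) VALUES ('Mike')")
mike_id = cur.lastrowid
conn.execute(
    "INSERT INTO facility (location, patient_id) VALUES ('San Francisco', ?)",
    (mike_id,),
)

def facility_insert_allowed(location, patient_id):
    """Return True if the facility row was accepted by the constraints."""
    try:
        conn.execute(
            "INSERT INTO facility (location, patient_id) VALUES (?, ?)",
            (location, patient_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

second_link_allowed = facility_insert_allowed("Oakland", mike_id)  # breaks UNIQUE
dangling_allowed = facility_insert_allowed("Nowhere", 999)         # breaks the FK

row = conn.execute("""
    SELECT p.name, f.location
    FROM patient AS p
    JOIN facility AS f ON f.patient_id = p.patient_id
""").fetchone()
```

The two tables keep their own independent primary keys; only the foreign key column carries the patient's id across.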

SQL Server 2005 UNIQUE Constraint on Multiple Column while creating table

I am working with a SQL Server 2005 database and I am facing a problem.
I am creating a table like this:
CREATE TABLE CONT_UNIQUE
(
NUM INT,
BRANCH VARCHAR(10),
PIN INT,
CONSTRAINT CON UNIQUE(NUM,BRANCH,PIN)
)
That is, I am adding a unique constraint across all the columns in my table. But when inserting values, it seems to treat only NUM as unique and allows duplicate values for BRANCH and PIN.
Below are my two insert queries.
INSERT INTO CONT_UNIQUE VALUES(1, 'MP', 123) -> Working fine
INSERT INTO CONT_UNIQUE VALUES(2, 'MP', 123) -> Should throw error since MP, and 123 are present.
Note:
CREATE TABLE CONT_UNIQUE
(
NUM INT UNIQUE ,
BRANCH VARCHAR(10) UNIQUE,
PIN INT UNIQUE
)
this works perfectly as expected.
Kindly let me know what is the problem with my queries.
You have created a single constraint that ensures no two rows have the same values in all 3 columns.
You want three separate constraints, one on NUM, one on BRANCH and one on PIN.
CREATE TABLE CONT_UNIQUE
(
NUM INT,
BRANCH VARCHAR(10),
PIN INT,
CONSTRAINT CON UNIQUE(NUM),
CONSTRAINT CON2 UNIQUE(BRANCH),
CONSTRAINT CON3 UNIQUE(PIN)
)
You have created a unique constraint on the combination of all 3 columns, not on 2 of them. That means you cannot insert 1,'MP',123 into the table again, but you can insert 1,'MP',12 or 1,'MP',13.
That won't throw an error because the unique constraint is on all 3 columns.
I think you want this as well / instead:
... CONSTRAINT only_two_columns UNIQUE (branch, pin) ...
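To see the three variants side by side, here is a small comparison run on SQLite through Python (the helper accepts and the third test row are assumptions; the first two rows are the inserts from the question):

```python
import sqlite3

def accepts(ddl, rows):
    """Create the table in a fresh database and report which inserts succeed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    outcome = []
    for row in rows:
        try:
            conn.execute("INSERT INTO cont_unique VALUES (?, ?, ?)", row)
            outcome.append(True)
        except sqlite3.IntegrityError:
            outcome.append(False)
    conn.close()
    return outcome

rows = [(1, "MP", 123), (2, "MP", 123), (1, "MP", 12)]

# One constraint over all three columns: only a full duplicate is rejected.
all_three = accepts(
    "CREATE TABLE cont_unique (num INT, branch VARCHAR(10), pin INT,"
    " CONSTRAINT con UNIQUE (num, branch, pin))",
    rows,
)

# One constraint over (branch, pin): the second row repeats that pair.
branch_pin = accepts(
    "CREATE TABLE cont_unique (num INT, branch VARCHAR(10), pin INT,"
    " CONSTRAINT only_two_columns UNIQUE (branch, pin))",
    rows,
)

# A separate constraint per column: any repeated single value is rejected.
per_column = accepts(
    "CREATE TABLE cont_unique (num INT UNIQUE, branch VARCHAR(10) UNIQUE,"
    " pin INT UNIQUE)",
    rows,
)
```

With these rows, the three-column constraint accepts everything, the (branch, pin) constraint rejects only the second row, and the per-column constraints reject both the second and third rows.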
From all of your replies I learnt that:
1) A unique key works on the unique combination of its columns rather than on the uniqueness of each individual column.
Ex: unique(Column1, Column2) means the combination of column1 and column2 should not repeat, but individual values can repeat.
2) If we want unique values in each column, then we need to mark each column as "unique" while creating the table.
Ex: Num int unique, Branch varchar(10) unique, etc., so that each column will have unique values.
Previously I thought Unique(Col1, col2) was the same as "col1 int unique, col2 int unique". So I asked the question.
ONCE AGAIN THANKS TO ALL OF YOU FOR YOUR SUPPORT IN SOLVING MY QUERY.. :)
Thanks
Mahesh

How to create a unique index on a NULL column?

I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.
My current solution involves a unique index on a view like so:
CREATE VIEW vw_unq WITH SCHEMABINDING AS
SELECT Column1
FROM MyTable
WHERE Column1 IS NOT NULL
CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)
Any better ideas?
Using SQL Server 2008, you can create a filtered index.
CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL
Another option is a trigger to check uniqueness, but this could affect performance.
The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:
CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)
Pretty sure you can't do that, as it violates the purpose of uniques.
However, this person seems to have a decent work around:
http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html
It is possible to use filter predicates to specify which rows to include in the index.
From the documentation:
WHERE <filter_predicate> Creates a filtered index by specifying which
rows to include in the index. The filtered index must be a
nonclustered index on a table. Creates filtered statistics for the
data rows in the filtered index.
Example:
CREATE TABLE Table1 (
NullableCol int NULL
)
CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;
Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.
However, that doesn't mean the concept of "unique nullable columns" is invalid. To implement it in any relational database we just have to bear in mind that this kind of database is meant to be normalized to work properly, and normalization usually involves adding several (non-entity) extra tables to establish relationships between the entities.
Let's work a basic example considering only one "unique nullable column", it's easy to expand it to more such columns.
Suppose we have the information represented by a table like this:
create table the_entity_incorrect
(
id integer,
uniqnull integer null, /* we want this to be "unique and nullable" */
primary key (id)
);
We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):
create table the_entity
(
id integer,
primary key(id)
);
create table the_relation
(
the_entity_id integer not null,
uniqnull integer not null,
unique(the_entity_id),
unique(uniqnull),
/* primary key can be both or either of the_entity_id or uniqnull */
primary key (the_entity_id, uniqnull),
foreign key (the_entity_id) references the_entity(id)
);
To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.
For rows in the_entity where no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.
Note that values for uniqnull will be unique for all the_relation, and also notice that for each value in the_entity there can be at most one value in the_relation, since the primary and foreign keys on it enforce this.
Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:
start transaction;
insert into the_entity (id) values (3);
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;
And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:
start transaction;
insert into the_entity (id) values (10);
commit;
To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:
select
id, uniqnull
from
the_entity left outer join the_relation
on
the_entity.id = the_relation.the_entity_id
;
The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching rows are present in the_relation.
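The whole walkthrough can be exercised end to end with a short script. This sketch uses SQLite through Python; the extra ids 11 and 12 are assumptions added to show a second entity without a value and an attempted duplicate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE the_entity (id INTEGER PRIMARY KEY);
    CREATE TABLE the_relation (
        the_entity_id INTEGER NOT NULL UNIQUE REFERENCES the_entity (id),
        uniqnull      INTEGER NOT NULL UNIQUE,
        PRIMARY KEY (the_entity_id, uniqnull)
    );
""")

# Entity 3 has uniqnull = 5; both rows go in as one transaction.
with conn:
    conn.execute("INSERT INTO the_entity (id) VALUES (3)")
    conn.execute("INSERT INTO the_relation (the_entity_id, uniqnull) VALUES (3, 5)")

# Entities 10 and 11 have no uniqnull value: no row in the_relation at all.
with conn:
    conn.execute("INSERT INTO the_entity (id) VALUES (10)")
with conn:
    conn.execute("INSERT INTO the_entity (id) VALUES (11)")

# A second entity claiming uniqnull = 5 is rejected and rolled back as a whole.
try:
    with conn:
        conn.execute("INSERT INTO the_entity (id) VALUES (12)")
        conn.execute("INSERT INTO the_relation (the_entity_id, uniqnull) VALUES (12, 5)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Denormalized view: NULL (None in Python) where no relation row exists.
rows = conn.execute("""
    SELECT e.id, r.uniqnull
    FROM the_entity AS e
    LEFT OUTER JOIN the_relation AS r ON e.id = r.the_entity_id
    ORDER BY e.id
""").fetchall()
```

Several entities end up with a NULL uniqnull in the joined result, while the non-NULL values stay unique, which is the behaviour the question asks for.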
Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.