How to make an index with an optional FK? - sql

##Original question##
So the business model, which I didn't create, has an optional relationship (as in the ER model). It's been a while since I've worked with databases, so I might be forgetting something. Currently the FK (foreign key) of the 1st table points to the PK (primary key) of the 2nd table, which is an ID; I don't recall the term, but it's the "fake" one, not the "real" one used by the RDBMS (relational database management system). For simplicity, let's imagine there are only 2 tables.
Currently I have nulls in the FK column/attribute when there's no need for the optional relation. When there is a value in that column, I want the full advantages: checking that there's a matching item on the other side of the relationship, where the FK points to (2nd table), as well as triggers (although there are currently none) and other validations. I was satisfied until recently, when I realized I didn't want duplicates on the important parts of the 1st table, so I wanted to create a unique key, but it seems a key cannot be created that includes a column/attribute that might contain null. So far 2 solutions have been proposed to me, although I understand neither.
The 1st was to use defaults: 0 for numeric types and an empty string ('') for character-based types. What I don't get is that the 2nd table already has a row/tuple with a corresponding value (0). If I were to shift the current rows so that none uses the default, I assume I would then have to put a default in the corresponding content too; in my case it's a character-based type. So the "cost" of enabling an index would be a multitude of useless joins followed by a multitude of useless merges by the software, in my case the database component of an office suite, Apache OpenOffice Base. This seems like a lot of added processing, and it seems to me that some kind of trigger, along with my current design, would be better.
The 2nd was to make a "linked" table (his/her term), a many-to-many relationship, but I thought those were only for entries that had more than 1 possible relationship, and that a 0-1 relationship would not use one. And anyway, I'd still be confronted with the same problem, since some rows would need no entry in that "linked" table. IIRC, the 2 "sides" of such a table must contain valid candidate keys.
So the 1-1 relationship is already implemented for the cases where the business model does need that option, with the current non-null entries in the FK. Now I just have to implement a method for the cases where the business model does not need the optional part, to allow a 0-1 relationship for the current null entries in the FK while not allowing duplicates.
##fredt request##
This now contains the 3rd example.
The following sub-section contains a semi-SQL export from Apache OpenOffice Base using the command SCRIPT 'PATH\TO\NAME.sql'. The original file, along with its export and its non-exported queries, is on How to make an index with an optional FK? example 3.
I'd like a unique key on the 3 columns/attributes ID_to_part1, model_number & ID_to_part2; however, as the original question in the previous section shows, HSQLDB version 1.8.0.10 won't allow a null in a column which is part of a unique key.
###HSQLDB export###
This produces some kind of SQL, including non-standard statements.
SET DATABASE COLLATION "Latin1_General"
CREATE SCHEMA PUBLIC AUTHORIZATION DBA
CREATE CACHED TABLE "Table1"("ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,"ID_to_part1" INTEGER NOT NULL,"model_number" VARCHAR_IGNORECASE(3) NOT NULL,"ID_to_part2" INTEGER)
CREATE CACHED TABLE "Table2"("ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,"content" VARCHAR_IGNORECASE(1) NOT NULL)
CREATE CACHED TABLE "Table3"("ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,"content" VARCHAR_IGNORECASE(1) NOT NULL)
ALTER TABLE "Table1" ADD CONSTRAINT SYS_FK_87 FOREIGN KEY("ID_to_part1") REFERENCES "Table3"("ID") ON DELETE CASCADE ON UPDATE CASCADE
ALTER TABLE "Table1" ADD CONSTRAINT SYS_FK_90 FOREIGN KEY("ID_to_part2") REFERENCES "Table2"("ID") ON DELETE SET NULL ON UPDATE CASCADE
ALTER TABLE "Table1" ALTER COLUMN "ID" RESTART WITH 15
ALTER TABLE "Table2" ALTER COLUMN "ID" RESTART WITH 2
ALTER TABLE "Table3" ALTER COLUMN "ID" RESTART WITH 4
CREATE USER SA PASSWORD ""
GRANT DBA TO SA
SET WRITE_DELAY 0 MILLIS
SET SCHEMA PUBLIC
INSERT INTO "Table1" VALUES(0,0,'123',0)
INSERT INTO "Table1" VALUES(1,1,'456',NULL)
INSERT INTO "Table1" VALUES(2,2,'789',0)
INSERT INTO "Table1" VALUES(3,0,'012',1)
INSERT INTO "Table1" VALUES(6,3,'345',NULL)
INSERT INTO "Table1" VALUES(7,1,'678',1)
INSERT INTO "Table1" VALUES(8,0,'123',NULL)
INSERT INTO "Table1" VALUES(9,0,'123',1)
INSERT INTO "Table1" VALUES(10,1,'456',0)
INSERT INTO "Table1" VALUES(11,1,'456',1)
INSERT INTO "Table1" VALUES(12,1,'456',0)
INSERT INTO "Table1" VALUES(13,1,'123',NULL)
INSERT INTO "Table1" VALUES(14,1,'123',0)
INSERT INTO "Table2" VALUES(0,'B')
INSERT INTO "Table2" VALUES(1,'E')
INSERT INTO "Table3" VALUES(0,'A')
INSERT INTO "Table3" VALUES(1,'C')
INSERT INTO "Table3" VALUES(2,'D')
INSERT INTO "Table3" VALUES(3,'F')
It seems the queries weren't exported; here they are, followed by their results.
###Query1###
Joined main table:
SELECT "Table1"."ID", "Table3"."content" AS "Table3_content", "Table1"."model_number", "Table2"."content" AS "Table2_content"
FROM "Table1"
LEFT OUTER JOIN "Table2" ON "Table1"."ID_to_part2" = "Table2"."ID"
LEFT OUTER JOIN "Table3" ON "Table1"."ID_to_part1" = "Table3"."ID"
ORDER BY "ID" ASC
Results in:
ID Table3_content model_number Table2_content
0 A 123 B
1 C 456
2 D 789 B
3 A 012 E
6 F 345
7 C 678 E
8 A 123
9 A 123 E
10 C 456 B
11 C 456 E
12 C 456 B
13 C 123
14 C 123 B
###Query2###
The rows/tuples whose first 2 parts of the unique index could "break" the desired unique index, should the 3rd also match. In other words, the other rows aren't a threat (Query1 minus Query2).
SELECT *
FROM "Table1"
-- It seem HSQLDB won't support tuples as in WHERE (col1, col2) IN ( SELECT col1, col2 FROM
WHERE "ID_to_part1" IN (
SELECT "ID_to_part1"
FROM "Table1"
GROUP BY "ID_to_part1", "model_number"
HAVING COUNT(*) > 1
) AND "model_number" IN (
SELECT "model_number"
FROM "Table1"
GROUP BY "ID_to_part1", "model_number"
HAVING COUNT(*) > 1
)
ORDER BY "ID_to_part1" ASC, "model_number" ASC, "ID_to_part2" ASC, "ID" ASC
Results in:
ID ID_to_part1 model_number ID_to_part2
8 0 123
0 0 123 0
9 0 123 1
13 1 123
14 1 123 0
1 1 456
10 1 456 0
12 1 456 0
11 1 456 1
###Query3###
The rows/tuples which would "break" the desired unique index.
SELECT "Table1".*
FROM "Table1"
JOIN (
SELECT "ID_to_part1", "model_number", "ID_to_part2"
FROM "Table1"
GROUP BY "ID_to_part1", "model_number", "ID_to_part2"
HAVING COUNT(*) > 1
) AS "non_unique_model"
ON "Table1"."ID_to_part1"="non_unique_model"."ID_to_part1"
AND "Table1"."model_number"="non_unique_model"."model_number"
AND "Table1"."ID_to_part2"="non_unique_model"."ID_to_part2"
ORDER BY "ID_to_part1" ASC, "model_number" ASC, "ID_to_part2" ASC, "ID" ASC
Results in:
ID ID_to_part1 model_number ID_to_part2
10 1 456 0
12 1 456 0
###Beautified important tables schema###
CREATE CACHED TABLE "Table1"(
"ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,
"ID_to_part1" INTEGER NOT NULL,
"model_number" VARCHAR_IGNORECASE(3) NOT NULL,
"ID_to_part2" INTEGER
)
CREATE CACHED TABLE "Table2"(
"ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,
"content" VARCHAR_IGNORECASE(1) NOT NULL
)

Welcome to SO! I find your question a little hard to read.
EDIT:
CREATE TABLE table1 (
id INTEGER NOT NULL PRIMARY KEY,
data1 INTEGER NOT NULL
);
CREATE TABLE table2 (
id INTEGER NOT NULL PRIMARY KEY REFERENCES table1(id),
data2 INTEGER NOT NULL
);
There are records in table1. For each record in table1, there is zero or one corresponding record in table2.
This pattern is similar to table inheritance.
Further explanation:
This would allow you to have the following data.
id  data1      id  data2
---------      ---------
0   1234       0   42
1   5678
2   9012       2   57
See that the records in table1 with ids 0 and 2 have corresponding records in table2. The record with id 1 does not.
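To make the zero-or-one pattern concrete, here is a minimal sketch that loads the same two tables and sample rows into an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in here (an assumption of this sketch, not the asker's HSQLDB), and the helper `try_insert` is a hypothetical name introduced for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real DBMS (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE table1 (
        id    INTEGER NOT NULL PRIMARY KEY,
        data1 INTEGER NOT NULL
    );
    CREATE TABLE table2 (
        id    INTEGER NOT NULL PRIMARY KEY REFERENCES table1(id),
        data2 INTEGER NOT NULL
    );
    INSERT INTO table1 VALUES (0, 1234), (1, 5678), (2, 9012);
    INSERT INTO table2 VALUES (0, 42), (2, 57);
""")

# Every table1 row appears once; data2 is NULL (None) where no table2 row exists.
rows = conn.execute("""
    SELECT t1.id, t1.data1, t2.data2
    FROM table1 AS t1
    LEFT OUTER JOIN table2 AS t2 ON t2.id = t1.id
    ORDER BY t1.id
""").fetchall()


def try_insert(sql):
    """Hypothetical helper: return True if the INSERT is accepted."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


second_child = try_insert("INSERT INTO table2 VALUES (0, 99)")  # duplicate primary key
orphan_child = try_insert("INSERT INTO table2 VALUES (7, 99)")  # no matching table1 row
```

Because `table2.id` is both the primary key and the foreign key, the database itself rejects a second child for the same parent as well as a child without a parent.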
P.S.
Note that you also could combine things into one table. Whether this is advisable depends on your situation.
CREATE TABLE table1 (
id INTEGER NOT NULL PRIMARY KEY,
data1 INTEGER NOT NULL,
data2 INTEGER NULL
);

I wanted to create an unique key but it seems a key cannot be created
which include a column that might contain null.
My understanding is that you have an FK on which you want to build an index to enhance performance, and that the FK may contain nulls (as in @Paul Draper's solution).
I am no expert in HSQLDB, but the user guide, under the Constraints section says:
"Since version 1.7.2 the behaviour of UNIQUE constraints and indexes with respect to NULL values has changed to conform to SQL standards. A row, in which the value for any of the UNIQUE constraint columns is NULL, can always be added to the table. So multiple rows can contain the same values for the UNIQUE columns if one of the values is NULL."
I understand this to mean that, from version 1.7.2 of the database onward, you can build an index on the FK even if the column contains rows where the FK value is NULL.
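The behaviour described in the quoted passage can be observed in any engine that follows the same rule. The sketch below uses SQLite via Python's `sqlite3` module purely as a stand-in (an assumption; it is not HSQLDB), with a cut-down version of the question's Table1 and a hypothetical helper `try_insert`.

```python
import sqlite3

# SQLite, like the behaviour quoted above, treats NULLs in a UNIQUE
# constraint as distinct from one another (stand-in engine, not HSQLDB).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Table1 (
        ID           INTEGER PRIMARY KEY,
        ID_to_part1  INTEGER NOT NULL,
        model_number TEXT    NOT NULL,
        ID_to_part2  INTEGER,
        UNIQUE (ID_to_part1, model_number, ID_to_part2)
    )
""")


def try_insert(part1, model, part2):
    """Hypothetical helper: return True if the row is accepted."""
    try:
        conn.execute(
            "INSERT INTO Table1 (ID_to_part1, model_number, ID_to_part2) "
            "VALUES (?, ?, ?)",
            (part1, model, part2),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, "456", 0),     # first row: accepted
    try_insert(1, "456", 0),     # exact duplicate: rejected
    try_insert(1, "456", None),  # NULL in the key: accepted
    try_insert(1, "456", None),  # same values with NULL again: also accepted
]
```

So the constraint can be created on a nullable column, but rows whose key contains NULL are never considered duplicates of each other.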

Your question was:
I didn't want a
duplicate on the important parts of the 1st table, so I wanted to
create an unique key but it seems a key cannot be created which
include a column that might contain null.
You don't want a duplicate on the "important parts" in Table1 but it is not clear which parts must be unique. Assuming the "important parts" are some of these three columns:
"ID_to_part1" INTEGER,"model_number" VARCHAR_IGNORECASE(3) NOT NULL,"ID_to_part2" INTEGER
A) If you create a unique constraint on "model_number", which is by definition NOT NULL:
CONSTRAINT UNIQUE ("model_number")
Then model_number values are unique but two different models can have the same ID_to_part1
B) In addition to (A) you can have this constraint:
CONSTRAINT UNIQUE ("model_number", "ID_to_part1")
Then each model_number will correspond to a unique ID_to_part1. If you don't have NOT NULL on ID_to_part1, then it can contain NULL for those model_number values that do not have an extra part.
C) In addition to (A) you can have this:
CONSTRAINT UNIQUE ("model_number", "ID_to_part2")
Which has the same effect as (B) but for the ID_to_part2 column.
Your SELECT statement is correct. It shows all models with any optional information they may have.
In short, you can have a UNIQUE constraint on columns that can have NULL in them. But the UNIQUE constraint on model_number is also required.
Edit:
The OP has edited the question again with the requirement that "model_number" is not unique; only the three columns together are unique, while some of them can store NULL and the NULL cannot be repeated. This is not possible to achieve with HSQLDB 1.8. In HSQLDB 2.x there is a setting, SET DATABASE SQL UNIQUE NULLS, which can be changed to FALSE to allow this. In this case only one UNIQUE constraint on the three columns is needed.
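For readers who want to see what "the NULL cannot be repeated" looks like in practice, here is a hedged sketch in SQLite (via Python's `sqlite3`), which is not HSQLDB and does not have the setting mentioned above. It emulates the requirement with a unique index over `COALESCE(ID_to_part2, -1)`; the sentinel `-1` is an assumption of this sketch and must never occur as a real ID. The names `uq_model` and `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (
        ID           INTEGER PRIMARY KEY,
        ID_to_part1  INTEGER NOT NULL,
        model_number TEXT    NOT NULL,
        ID_to_part2  INTEGER
    );
    -- Map NULL to a sentinel so that two NULLs collide in the index.
    -- Assumption: -1 is never used as a real ID_to_part2 value.
    CREATE UNIQUE INDEX uq_model
        ON Table1 (ID_to_part1, model_number, COALESCE(ID_to_part2, -1));
""")


def try_insert(part1, model, part2):
    """Hypothetical helper: return True if the row is accepted."""
    try:
        conn.execute(
            "INSERT INTO Table1 (ID_to_part1, model_number, ID_to_part2) "
            "VALUES (?, ?, ?)",
            (part1, model, part2),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, "456", 0),     # accepted
    try_insert(1, "456", 0),     # duplicate: rejected
    try_insert(1, "456", None),  # first NULL for this pair: accepted
    try_insert(1, "456", None),  # repeated NULL: rejected
    try_insert(1, "123", None),  # different model_number: accepted
]
```

The column itself still stores NULL, so the foreign key to Table2 would keep working as before; only the index sees the sentinel.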

Related

How to inherit generated column in Postgres

I have 3 tables as follows:
CREATE TABLE A (
ID integer PRIMARY KEY generated always as identity,
data integer
);
CREATE TABLE B (
other_data integer
) INHERITS (A);
CREATE TABLE C (
other_other_data integer
) INHERITS (A);
My intent is to have unique ids across tables B and C so they don't mix up. When trying this approach and inserting data into B (insert into B (other_data) values (1)) I get the following error:
ERROR: null value in column "id" of relation "B" violates not-null constraint
DETAIL: Failing row contains (null)
I suspect this approach is not the correct way to make 2 tables share unique ids. And Postgres documentation really sucks. I tried using serial for a long time. Only to find out that buried in the wiki is a warning to not use them.
-edit: added the insert statement

How to insert into a table considering that table has a Primary Key already in it?

I have two tables A and B and need to insert records (few columns not all) from A into B.
Of course I guess I can do:
INSERT INTO B (col2)
SELECT DISTINCT col2
FROM A
However, Col1 in table B (named ID) has a type of INT so it is causing this error:
Msg 515, Level 16, State 2, Line 1
Cannot insert the value NULL into column 'ID', table 'MyDB.dbo.Visitor'; column does not allow nulls. INSERT fails.
How can I make SQL Server ignore this column and just insert the data I need?
Thanks.
A primary key must fulfill two conditions:
It must be unique in the table (that is, any row in the table can be identified by its primary key), and
The fields that are part of the primary key cannot be NULL. That's because allowing NULL values in the primary key will make the uniqueness of the primary key impossible to hold, because a NULL value is non-equal to any other value, including another NULL.
Quoting from here:
A unique key constraint does not imply the NOT NULL constraint in practice. Because NULL is not an actual value (it represents the lack of a value), when two rows are compared, and both rows have NULL in a column, the column values are not considered to be equal. Thus, in order for a unique key to uniquely identify each row in a table, NULL values must not be used.
This should work assuming you don't insert duplicates into B:
INSERT INTO B (col2)
SELECT DISTINCT col2
FROM A
WHERE col2 IS NOT NULL
Set ID column in table B to "auto-increment".
SQL Server will provide automatically unique values for ID column if you define it as IDENTITY
In your case you can calculate the maximum value of ID column and start IDENTITY from the value that exceeds that maximum.
See the accepted answer for SQL Server, How to set auto increment after creating a table without data loss? for such code.
You need to create a relationship between the two tables and do an update statement.
UPDATE b
SET b.valueb = a.valuea
FROM B AS b
INNER JOIN A AS a ON a.id = b.id
You also need to rethink your design a little bit it sounds like.

how to keep combination of cells unique

I have table A and table B. I have a bridge table called tableC.
In table C I have:
ID
tableA_ID
tableB_ID
ID is the primary key.
I also want to enforce the combination of tableA_ID and tableB_ID to be unique so there are no duplicate records.
How do I enforce this?
create unique index myIdx on tableC(tableA_ID, tableB_ID)
or whatever the syntax for your particular database system is.
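As a quick check of that statement, the following sketch creates the bridge table and the composite unique index in an in-memory SQLite database using Python's `sqlite3` module. SQLite is an assumed stand-in for whichever database system is actually in use, and `try_link` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableC (
        ID        INTEGER PRIMARY KEY,
        tableA_ID INTEGER NOT NULL,
        tableB_ID INTEGER NOT NULL
    );
    CREATE UNIQUE INDEX myIdx ON tableC (tableA_ID, tableB_ID);
""")


def try_link(a_id, b_id):
    """Hypothetical helper: return True if the pair is accepted."""
    try:
        conn.execute(
            "INSERT INTO tableC (tableA_ID, tableB_ID) VALUES (?, ?)",
            (a_id, b_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_link(1, 4),  # new pair: accepted
    try_link(1, 5),  # same tableA_ID, different tableB_ID: accepted
    try_link(1, 4),  # repeated pair: rejected by myIdx
]
```

The surrogate ID stays the primary key; the index only constrains the combination of the two foreign-key columns.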
Make the PRIMARY KEY (tableA_ID, tableB_ID), excluding ID.
Let's say we have a table TABLEA with values
tableAID
1
2
3
and table TABLEB with values
tableBID
4
5
6
Making the primary key (ID, tableA_ID, tableB_ID) will not work, e.g. the rows
ID | tableAID | tableBID
1 | 1 | 4
2 | 1 | 4
would both be accepted with the above pk even though the (tableA_ID, tableB_ID) pair is repeated, so you need PRIMARY KEY (tableA_ID, tableB_ID).
Drop the ID column then make the other two columns the primary key and their uniqueness will be enforced by the database server.
It's not really necessary to have the ID column - even though it serves as a handy way of referencing a particular record - as the uniqueness of the other two columns will mean that they are sufficient to reference a particular record.
You may also want to put an index on this table that includes both columns, to make access faster.

Set based insert into two tables with 1 to 0-1 relation

I have two tables, the first has a primary key that is an identity, the second has a primary key that is not, but that key has a foreign key constraint back to the first table's primary key.
If I am inserting one record at a time I can use the Scope_Identity to get the value for the pk just inserted in table 1 that I want to insert into the second table.
My problem is I have many records coming from selects I want to insert in both tables, I've not been able to think of a set based way to do these inserts.
My current solution is to use a cursor, insert in the first table, get key using scope_identity, insert into second table, repeat.
Am I missing a non-cursor solution?
Yes, look up the OUTPUT clause in Books Online.
I had this problem just this week: someone had introduced a table with a meaningless surrogate key into a schema where natural keys are used. No doubt I'll fix this soon :) Until then, I'm working around it by creating a table of data to INSERT from: this could be a permanent or temporary base table or a derived table (see below), which should suit your desire for a set-based solution anyhow. Join this table to the table with the IDENTITY column on the natural key to find out the auto-generated values. Here's a brief example:
CREATE TABLE Test1
(
surrogate_key INTEGER IDENTITY NOT NULL UNIQUE,
natural_key CHAR(10) NOT NULL CHECK (natural_key NOT LIKE '%[^0-9]%') UNIQUE
);
CREATE TABLE Test2
(
surrogate_key INTEGER NOT NULL UNIQUE
REFERENCES Test1 (surrogate_key),
data_col INTEGER NOT NULL
);
INSERT INTO Test1 (natural_key)
SELECT DT1.natural_key
FROM (
SELECT '0000000001', 22
UNION ALL
SELECT '0000000002', 55
UNION ALL
SELECT '0000000003', 99
) AS DT1 (natural_key, data_col);
INSERT INTO Test2 (surrogate_key, data_col)
SELECT T1.surrogate_key, DT1.data_col
FROM (
SELECT '0000000001', 22
UNION ALL
SELECT '0000000002', 55
UNION ALL
SELECT '0000000003', 99
) AS DT1 (natural_key, data_col)
INNER JOIN Test1 AS T1
ON T1.natural_key = DT1.natural_key;

How to create a unique index on a NULL column?

I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.
My current solution involves a unique index on a view like so:
CREATE VIEW vw_unq WITH SCHEMABINDING AS
SELECT Column1
FROM MyTable
WHERE Column1 IS NOT NULL
CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)
Any better ideas?
Using SQL Server 2008, you can create a filtered index.
CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL
Another option is a trigger to check uniqueness, but this could affect performance.
The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:
CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)
Pretty sure you can't do that, as it violates the purpose of uniques.
However, this person seems to have a decent work around:
http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html
It is possible to use filter predicates to specify which rows to include in the index.
From the documentation:
WHERE <filter_predicate> Creates a filtered index by specifying which
rows to include in the index. The filtered index must be a
nonclustered index on a table. Creates filtered statistics for the
data rows in the filtered index.
Example:
CREATE TABLE Table1 (
NullableCol int NULL
)
CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;
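The same `WHERE` syntax can be tried without a SQL Server instance, since SQLite also supports partial indexes. The sketch below is only an approximation under that assumption: unlike SQL Server, SQLite already allows repeated NULLs in an unfiltered unique index, so here the filter merely keeps NULL rows out of the index. The accept/reject pattern shown is the one the filtered index is meant to give. `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (
        NullableCol INTEGER NULL
    );
    -- Only non-NULL values take part in the uniqueness check.
    CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol)
        WHERE NullableCol IS NOT NULL;
""")


def try_insert(value):
    """Hypothetical helper: return True if the value is accepted."""
    try:
        conn.execute("INSERT INTO Table1 (NullableCol) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(None),  # accepted
    try_insert(None),  # NULLs are outside the index: accepted again
    try_insert(5),     # accepted
    try_insert(5),     # duplicate non-NULL value: rejected
]
```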
Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.
However, that doesn't mean the concept of "unique nullable columns" is valid; to actually implement it in any relational database we just have to bear in mind that this kind of database is meant to be normalized to work properly, and normalization usually involves adding several (non-entity) extra tables to establish relationships between the entities.
Let's work through a basic example considering only one "unique nullable column"; it's easy to expand it to more such columns.
Suppose we have the information represented by a table like this:
create table the_entity_incorrect
(
id integer,
uniqnull integer null, /* we want this to be "unique and nullable" */
primary key (id)
);
We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):
create table the_entity
(
id integer,
primary key(id)
);
create table the_relation
(
the_entity_id integer not null,
uniqnull integer not null,
unique(the_entity_id),
unique(uniqnull),
/* primary key can be both or either of the_entity_id or uniqnull */
primary key (the_entity_id, uniqnull),
foreign key (the_entity_id) references the_entity(id)
);
To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.
For rows in the_entity where no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.
Note that values for uniqnull will be unique across all of the_relation, and also notice that for each row in the_entity there can be at most one row in the_relation, since the unique constraint on the_entity_id enforces this.
Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:
start transaction;
insert into the_entity (id) values (3);
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;
And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:
start transaction;
insert into the_entity (id) values (10);
commit;
To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:
select
id, uniqnull
from
the_entity left outer join the_relation
on
the_entity.id = the_relation.the_entity_id
;
The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching rows are present in the_relation.
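The whole walkthrough can be exercised end to end with a small sketch. It uses SQLite through Python's `sqlite3` module as a stand-in engine (an assumption), the same two tables, the ids 3 and 10 from the text plus an extra id 4 added only for illustration, and a hypothetical helper `try_relate`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE the_entity (
        id INTEGER,
        PRIMARY KEY (id)
    );
    CREATE TABLE the_relation (
        the_entity_id INTEGER NOT NULL,
        uniqnull      INTEGER NOT NULL,
        UNIQUE (the_entity_id),
        UNIQUE (uniqnull),
        PRIMARY KEY (the_entity_id, uniqnull),
        FOREIGN KEY (the_entity_id) REFERENCES the_entity (id)
    );
    -- Entity 3 gets the value 5; entities 4 and 10 have no value.
    INSERT INTO the_entity (id) VALUES (3), (4), (10);
    INSERT INTO the_relation (the_entity_id, uniqnull) VALUES (3, 5);
""")


def try_relate(entity_id, value):
    """Hypothetical helper: return True if the association is accepted."""
    try:
        conn.execute(
            "INSERT INTO the_relation (the_entity_id, uniqnull) VALUES (?, ?)",
            (entity_id, value),
        )
        return True
    except sqlite3.IntegrityError:
        return False


second_value_for_3 = try_relate(3, 6)  # entity 3 already has a value
reused_value_5 = try_relate(4, 5)      # value 5 already belongs to entity 3

# Denormalized view: one row per entity, None where no value is associated.
denormalized = conn.execute("""
    SELECT id, uniqnull
    FROM the_entity
    LEFT OUTER JOIN the_relation
        ON the_entity.id = the_relation.the_entity_id
    ORDER BY id
""").fetchall()
```

Both rejected inserts fail on the two single-column UNIQUE constraints, which is what gives "unique but optional" without storing any NULL.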
Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.