What is the difference between check and foreign key? - sql

I am quite confused about the difference between a FOREIGN KEY and a CHECK constraint; they appear to me to achieve the same result.
I mean, I could create a table and enforce a foreign key on another table, but I could also create a CHECK to ensure the value is in another table.
What is the difference, and when should I use one or the other?

A FOREIGN KEY constraint ensures that the entry DOES EXIST in another table.
EDIT: as per the correct comment by Mark Byers, it exists in another table... or in the same table.
A CHECK constraint ensures that the entry follows some rule.
CHECK Constraints
CHECK constraints enforce domain integrity by limiting the values that are accepted by a column. They are similar to FOREIGN KEY constraints in that they control the values that are put in a column. The difference is in how they determine which values are valid: FOREIGN KEY constraints obtain the list of valid values from another table, and CHECK constraints determine the valid values from a logical expression that is not based on data in another column.

A foreign key constraint is more powerful than a CHECK constraint.
A foreign key constraint means that the column (in the current table) can only have values that already exist in the column of the foreign table (which can be the same table, as is often done for hierarchical data). This means that as the list of values changes, getting bigger or smaller, there's no need to update the constraint.
A check constraint can not reference any columns outside of the current table, and can not contain a subquery. Often, the values are hard coded like BETWEEN 100 and 999 or IN (1, 2, 3). This means that as things change, you'll have to update the CHECK constraint every time. Also, a foreign key relationship is visible on an Entity Relationship Diagram (ERD), while a CHECK constraint will never be. The benefit is that someone can read the ERD and construct a query from it without using numerous DESC table commands to know what columns are where and what relates to what to construct proper joins.
Best practice is to use foreign keys (and supporting tables) first. Use CHECK constraints as a backup for situations where you can't use a foreign key, not as the primary solution to validate data.
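
To make the maintenance argument concrete, here is a minimal sketch using SQLite through Python's sqlite3 module (the question does not name a DBMS, so this choice is an assumption). The tables `status`, `ticket_fk` and `ticket_check` are hypothetical names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on

conn.executescript("""
    -- Hypothetical lookup table holding the currently valid status codes.
    CREATE TABLE status (code TEXT PRIMARY KEY);
    INSERT INTO status VALUES ('open'), ('closed');

    -- FK version: valid values come from the status table.
    CREATE TABLE ticket_fk (
        id     INTEGER PRIMARY KEY,
        status TEXT NOT NULL REFERENCES status(code)
    );

    -- CHECK version: valid values are hard coded in the constraint.
    CREATE TABLE ticket_check (
        id     INTEGER PRIMARY KEY,
        status TEXT NOT NULL CHECK (status IN ('open', 'closed'))
    );
""")


def try_insert(table, status):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} (status) VALUES (?)", (status,))
        return True
    except sqlite3.IntegrityError:
        return False


before_fk = try_insert("ticket_fk", "pending")        # not in the lookup table yet
before_check = try_insert("ticket_check", "pending")  # not in the IN list

# A new valid value only needs a new row in the lookup table ...
conn.execute("INSERT INTO status VALUES ('pending')")
after_fk = try_insert("ticket_fk", "pending")
# ... while the CHECK keeps rejecting it until the table definition is changed.
after_check = try_insert("ticket_check", "pending")

print(before_fk, before_check, after_fk, after_check)
```

With the foreign key, extending the list of valid values is a data change; with the hard-coded CHECK it is a schema change.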

It depends on your DBMS (which you didn't specify), but in one sense, you are correct: a foreign key constraint is a particular case of a check constraint. There are DBMS which would not allow you to formulate a foreign key constraint as a check constraint.
The main intention of a check constraint is to describe conditions that apply to a single row in the table. For example, I have a table of elements (as in Hydrogen, Helium, ...) and the symbols for the elements are constrained to start with an upper-case letter and are followed by zero, one or two lower-case letters (two lower-case letters for as yet undiscovered but predicted elements: Uus - ununseptium (117), which has just been isolated but has yet to be named). This can be the subject of a CHECK constraint:
CHECK(Symbol MATCHES "[A-Z][a-z]{0,2}")
assuming MATCHES exists and supports an appropriate regular expression language.
You can also have check constraints that compare values:
CHECK(OrderDate <= ShipDate OR ShipDate IS NULL)
To express a foreign key constraint as a check constraint, you have to be permitted to execute a query in the CHECK clause. Hypothetically:
CHECK(EXISTS(SELECT * FROM SomeTable AS s
WHERE ThisTable.pk_col1 = s.pk_col1 AND
ThisTable.pk_col2 = s.pk_col2))
This example shows some of the problems. I don't have a convenient table alias for the table in which I'm writing the check constraint - I assumed it was 'ThisTable'. The construct is verbose. Assuming that the primary key on SomeTable is declared on columns pk_col1 and pk_col2, then the FOREIGN KEY clause is much more compact:
FOREIGN KEY (pk_col1, pk_col2) REFERENCES SomeTable
Or, if you are referencing an alternative key, not the primary key:
FOREIGN KEY (pk_col1, pk_col2) REFERENCES SomeTable(ak_col1, ak_col2)
This is notationally more compact - so there is less chance of getting it wrong - and can be special-cased by the server because the special notation means it knows that it is dealing with a foreign key constraint whereas the general check clause has to be scrutinized to see if it matches one of many possible forms that are equivalent to the foreign key.
The question asks: when to use a check constraint and when to use a foreign key constraint?
Use a CHECK constraint to specify criteria that can be checked in a single row.
Use a FOREIGN KEY constraint to specify that the values in the current row must match the values of a row in some other unique key (a candidate key, usually the primary key rather than an alternative key) of some table - which may be the same table or (more usually) a different table.
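
The two rules of thumb above can be tried side by side. The following is only a sketch in SQLite via Python's sqlite3 (an assumed DBMS, since none is specified); the `Orders` table is hypothetical, and dates are stored as ISO-8601 text so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    CREATE TABLE SomeTable (
        pk_col1 INTEGER,
        pk_col2 INTEGER,
        PRIMARY KEY (pk_col1, pk_col2)
    );
    INSERT INTO SomeTable VALUES (1, 10);

    -- Hypothetical referencing table.
    CREATE TABLE Orders (
        pk_col1   INTEGER NOT NULL,
        pk_col2   INTEGER NOT NULL,
        OrderDate TEXT NOT NULL,
        ShipDate  TEXT,
        -- Single-row rule: decided by looking at this row alone.
        CHECK (OrderDate <= ShipDate OR ShipDate IS NULL),
        -- Cross-table rule: the pair must exist as a key in SomeTable.
        FOREIGN KEY (pk_col1, pk_col2) REFERENCES SomeTable (pk_col1, pk_col2)
    );
""")


def accepted(row):
    """Return True if the row can be inserted into Orders."""
    try:
        conn.execute("INSERT INTO Orders VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


shipped_later = accepted((1, 10, "2024-01-05", "2024-01-09"))
not_shipped = accepted((1, 10, "2024-01-05", None))
shipped_before_order = accepted((1, 10, "2024-01-05", "2024-01-01"))  # violates CHECK
unknown_parent = accepted((1, 11, "2024-01-05", None))                # violates FK

print(shipped_later, not_shipped, shipped_before_order, unknown_parent)
```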

Consider a scenario like this:
Table A has a keyword column, and the value must be among thousands of provided keywords.
How would you like to implement the constraint?
With a hard-coded check condition like check (keyword in ('a', 'b', 'c' .......)), or by simply importing the provided keywords as another table and setting a foreign key constraint on the keyword column of Table A?

Related

Does adding a foreign key to a table affect its insertion time?

Is it right to assume that each foreign key added to a table also adds a CHECK-like constraint ensuring that values inserted into the foreign key column come from the set of values in the table where that key is the primary key?
This would imply that a table with more foreign keys would take longer to insert a value into. Is this correct?
I am using Microsoft SQL Server 2014.
Yes. Foreign key relationships are checked when data is inserted or modified in the table.
The foreign key needs to be to a primary key or unique key. This guarantees that an index is available for the check.
In general, looking up the value in the index should be pretty fast. Faster than the other things that are going on in an insert, such as finding a free page for the data and logging the data.
However, validating the foreign key is going to add some overhead.
Don't mix up foreign keys and checks; they are two different constraint types. A CHECK evaluates an expression against the row itself, while a foreign key looks up a matching row in the referenced table. Both accept NULL unless the column is also declared NOT NULL: a CHECK whose expression evaluates to unknown passes, and a foreign key column holding NULL is not validated.
When rows are inserted or updated, the database executes a set of steps, e.g. checking the existence of tables and columns and verifying privileges. When there is an FK, the database engine must also verify the constraint before inserting or updating data in the table; it is an additional step to execute.
I have never experienced a situation where an FK painfully slowed down database operations.
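
A small experiment shows when the foreign key is verified and how each constraint type treats NULL. It is a sketch in SQLite via Python's sqlite3 rather than SQL Server 2014, so treat it as an illustration of the general behaviour, not of that product; the `parent` and `child` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    INSERT INTO parent VALUES (1);

    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id),  -- nullable FK column
        qty       INTEGER CHECK (qty > 0)         -- nullable checked column
    );
""")


def accepted(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# NULL in the FK column: nothing to look up, so the row is accepted.
null_fk = accepted("INSERT INTO child VALUES (1, NULL, 5)")
# NULL in the checked column: qty > 0 is unknown, which does not violate the CHECK.
null_check = accepted("INSERT INTO child VALUES (2, 1, NULL)")
# The FK is verified on INSERT ...
bad_insert = accepted("INSERT INTO child VALUES (3, 99, 5)")
# ... and again on UPDATE of the referencing column.
bad_update = accepted("UPDATE child SET parent_id = 99 WHERE id = 2")

print(null_fk, null_check, bad_insert, bad_update)
```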

How to know when to create a composite constraint?

I am currently learning SQL, and I have a physical data model I need to implement in code. However, during constraint creation, the numbers appearing next to FK and U started confusing me immensely. Consider the diagram. EDIT: Added the full physical model.
I know that for primary keys we must have a single PK constraint covering all the columns marked as PK. However, when it comes to FK or unique constraints, I'm not so sure.
Let's assume I want to create the FK constraints for the table Opcao.
Should I create a single constraint for multiple columns, referencing their respective columns like this:
ALTER TABLE MySchema.Opcao ADD CONSTRAINT [FK_SUPERKEY] FOREIGN KEY ([prova], [aluno], [pergunta], [dataRealizacao])
REFERENCES MySchema.Integra([prova], [aluno], [pergunta], [dataRealizacao]);
Or create a constraint for each column, like this:
ALTER TABLE MySchema.Opcao ADD CONSTRAINT [FK_OPCAO_PROVA] FOREIGN KEY ([prova])
REFERENCES MySchema.Integra([prova]);
ALTER TABLE MySchema.Opcao ADD CONSTRAINT [FK_OPCAO_ALUNO] FOREIGN KEY ([aluno])
REFERENCES MySchema.Integra([aluno]);
ALTER TABLE MySchema.Opcao ADD CONSTRAINT [FK_OPCAO_PERGUNTA] FOREIGN KEY ([pergunta])
REFERENCES MySchema.Integra([pergunta]);
ALTER TABLE MySchema.Opcao ADD CONSTRAINT [FK_OPCAO_DATAREALIZACAO] FOREIGN KEY ([dataRealizacao])
REFERENCES MySchema.Integra([dataRealizacao]);
Would the Unique constraints follow the same logic? How do I know when to do one or the other?
You want to make a foreign key consisting of several columns that all have to match the corresponding columns in the referenced table?
Then, in my opinion, you should use one constraint covering all of those columns, because that is the semantics you want to express.
The one-constraint-per-column approach is not equivalent: each column is then checked on its own, so a combination of values that never occurs together in the referenced table can still pass. Most DBMSs also require each referenced column to be covered by its own primary key or unique constraint, which the individual columns of a composite key usually are not.
Some other tips: I don't fully get the semantics of the schema because I don't know the language the entities are named in. It would be easier if they were named in English. One thing I noticed is the pergunta column, which is duplicated and needs to be kept consistent across the Opcao, Integra and Pergunta tables; this may lead to problems.
It has generally helped me to always create an artificial auto-increment primary key for every table (even the join tables for n-to-m relations) and to always reference this artificial key. Then you have fewer problems (with case insensitivity, for example), and the schema is, in my opinion, easier to understand.
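
The difference between one composite constraint and several single-column constraints can be checked with a reduced model. This is a sketch in SQLite via Python's sqlite3, not the SQL Server syntax used in the question, and the two-column tables below are hypothetical stand-ins for Integra and Opcao.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    -- Parent with a composite key.
    CREATE TABLE parent_pair (
        a INTEGER,
        b TEXT,
        PRIMARY KEY (a, b)
    );
    -- Parent whose columns are each unique on their own, which is what
    -- referencing them one at a time requires.
    CREATE TABLE parent_cols (
        a INTEGER UNIQUE,
        b TEXT UNIQUE
    );
    INSERT INTO parent_pair VALUES (1, 'x'), (2, 'y');
    INSERT INTO parent_cols VALUES (1, 'x'), (2, 'y');

    CREATE TABLE child_composite (
        a INTEGER,
        b TEXT,
        FOREIGN KEY (a, b) REFERENCES parent_pair (a, b)
    );
    CREATE TABLE child_per_column (
        a INTEGER REFERENCES parent_cols (a),
        b TEXT    REFERENCES parent_cols (b)
    );
""")


def accepted(table, a, b):
    """Return True if (a, b) can be inserted into the given child table."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (a, b))
        return True
    except sqlite3.IntegrityError:
        return False


# (1, 'y') mixes values from two different parent rows.
composite_mixed = accepted("child_composite", 1, "y")
per_column_mixed = accepted("child_per_column", 1, "y")
# (1, 'x') is a real parent row and passes either way.
composite_real = accepted("child_composite", 1, "x")
per_column_real = accepted("child_per_column", 1, "x")

print(composite_mixed, per_column_mixed, composite_real, per_column_real)
```

Only the composite constraint rejects the mixed pair, because it is the only one that compares the columns as a unit.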

Set on demand foreign key in DataBase

I have a table in my database with these specs:
one PK
3 fields with foreign keys
some statistics fields
The problem is this:
In every row, only one FK field will be filled.
What is the best solution, A or B?
A - define 3 FKs for my table
B - define one field as FK_TYPE and one field as DEMAND_FK, and check FK_TYPE to interpret the result
Option A - if you've got to have this design, you'll need a separate column for each foreign key. There's no (standard) way to define a "conditional" foreign key.
If your system supports check constraints, include a check constraint so that exactly one of the FK columns is not null. If it doesn't support check constraints, add triggers that enforce this same check.
If I am not wrong, B is not possible to enforce in any relational database. A foreign key can reference only one key of one table. If you use B, then you have to enforce the constraint at the application level. Otherwise use A.
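
Option A combined with the "exactly one is not null" check might look like the following. It is a sketch in SQLite via Python's sqlite3 (the question names no DBMS), and the tables `a`, `b`, `c` and `stats` with their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY);
    CREATE TABLE c (id INTEGER PRIMARY KEY);
    INSERT INTO a VALUES (1);
    INSERT INTO b VALUES (1);
    INSERT INTO c VALUES (1);

    CREATE TABLE stats (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER REFERENCES a(id),
        b_id INTEGER REFERENCES b(id),
        c_id INTEGER REFERENCES c(id),
        hits INTEGER NOT NULL DEFAULT 0,
        -- Count the filled FK columns; exactly one must be set.
        CHECK (
            (CASE WHEN a_id IS NULL THEN 0 ELSE 1 END) +
            (CASE WHEN b_id IS NULL THEN 0 ELSE 1 END) +
            (CASE WHEN c_id IS NULL THEN 0 ELSE 1 END) = 1
        )
    );
""")


def accepted(a_id, b_id, c_id):
    """Return True if a stats row with these references can be inserted."""
    try:
        conn.execute(
            "INSERT INTO stats (a_id, b_id, c_id) VALUES (?, ?, ?)",
            (a_id, b_id, c_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


one_filled = accepted(1, None, None)
none_filled = accepted(None, None, None)  # CHECK fails: zero references
two_filled = accepted(1, 1, None)         # CHECK fails: two references
dangling = accepted(None, 99, None)       # CHECK passes, FK fails

print(one_filled, none_filled, two_filled, dangling)
```

Each column keeps a real foreign key, so referential integrity is still enforced by the database, and the CHECK only adds the row-level rule.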

Omitting columns of parent-table when creating Foreign Key

To create a foreign key in Oracle, sometimes I see
CONSTRAINT FK_Supplier
FOREIGN KEY (Supplier_id)
REFERENCES Supplier(Supplier_id)
But, some other times, I see this
CONSTRAINT FK_Supplier
FOREIGN KEY (Supplier_id)
REFERENCES Supplier
The difference is that the column Supplier_id comes after the table Supplier in the first statement but it is omitted in the second statement.
Thanks for helping
This is described in the documentation:
If you identify only the parent table or view and omit the column
name, then the foreign key automatically references the primary key of
the parent table or view. The corresponding column or columns of the
foreign key and the referenced key must match in order and datatype.
One of the major criticisms of SQL as regards not being faithful to the relational model is its reliance on column ordering. However, just because SQL includes non-relational features does not mean that one should use them; in fact, I feel strongly that such features should be avoided or, when avoidance is impossible, mitigated against.
Standard SQL provides some syntax to avoid column ordering reliance (NATURAL JOIN, UNION CORRESPONDING, etc). Other syntax helps mitigate against such reliance (e.g. INSERT INTO (<comma list of columns>) VALUES (<comma list of fields in same order>)). FOREIGN KEY syntax falls into this second category.
Conclusion: always use the syntax in your first example and avoid the second.
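
Both spellings can be compared directly. The sketch below uses SQLite via Python's sqlite3 instead of Oracle; SQLite also falls back to the parent's primary key when the column list is omitted, so it serves as a stand-in here. The `Product_*` tables and the constraint names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    CREATE TABLE Supplier (
        Supplier_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    INSERT INTO Supplier VALUES (1, 'Acme');

    -- Referenced column spelled out.
    CREATE TABLE Product_explicit (
        Product_id  INTEGER PRIMARY KEY,
        Supplier_id INTEGER NOT NULL,
        CONSTRAINT FK_Supplier_explicit
            FOREIGN KEY (Supplier_id)
            REFERENCES Supplier (Supplier_id)
    );

    -- Referenced column omitted: the parent's primary key is implied.
    CREATE TABLE Product_implicit (
        Product_id  INTEGER PRIMARY KEY,
        Supplier_id INTEGER NOT NULL,
        CONSTRAINT FK_Supplier_implicit
            FOREIGN KEY (Supplier_id)
            REFERENCES Supplier
    );
""")


def accepted(table, supplier_id):
    """Return True if a product pointing at supplier_id can be inserted."""
    try:
        conn.execute(f"INSERT INTO {table} (Supplier_id) VALUES (?)", (supplier_id,))
        return True
    except sqlite3.IntegrityError:
        return False


explicit_ok = accepted("Product_explicit", 1)
implicit_ok = accepted("Product_implicit", 1)
explicit_bad = accepted("Product_explicit", 2)  # no such supplier
implicit_bad = accepted("Product_implicit", 2)  # no such supplier

print(explicit_ok, implicit_ok, explicit_bad, implicit_bad)
```

The two forms behave the same; the explicit one simply documents which key is referenced without requiring the reader to look up the parent table.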

What is the difference between a primary key and a unique constraint?

Someone asked me this question on an interview...
Primary keys can't be null. Unique keys can.
A primary key is a unique field on a table, but it is special in the sense that the table treats it as the key for its rows. This means that other tables can reference this field in their foreign key relationships.
A unique constraint simply means that a particular field must be unique.
A primary key cannot be null, while a unique key can (SQL Server allows only one NULL in a unique column; several other DBMSs allow many).
In SQL Server, a primary key creates a clustered index by default, while a unique key creates a nonclustered one.
A table can have only one primary key but more than one unique key.
TL;DR Much can be implied by PRIMARY KEY (uniqueness, reference-able, non-null-ness, clustering, etc) but nothing that can't be stated explicitly using UNIQUE.
I suggest that if you are the kind of coder who likes the convenience of SELECT * FROM... without having to list out all those pesky columns then PRIMARY KEY is just the thing for you.
a relvar can have several keys, but we choose just one for underlining
and call that one the primary key. The choice is arbitrary, so the
concept of primary is not really very important from a logical point
of view. The general concept of key, however, is very important! The
term candidate key means exactly the same as key (i.e., the addition
of candidate has no real significance—it was proposed by Ted Codd
because he regarded each key as a candidate for being nominated as the
primary key)... SQL allows a subset of a table's columns to be
declared as a key for that table. It also allows one of them to be
nominated as the primary key. Specifying a key to be primary makes
for a certain amount of convenience in connection with other
constraints that might be needed
What Is a Key? by Hugh Darwen
it's usual... to single out one key as the primary key (and any other
keys for the relvar in question are then said to be alternate keys).
But whether some key is to be chosen as primary, and if so which one,
are essentially psychological issues, beyond the purview of the
relational model as such. As a matter of good practice, most base
relvars probably should have a primary key—but, to repeat, this rule,
if it is a rule, really isn't a relational issue as such... Strong
recommendation [to SQL users]: For base tables, at any rate, use
PRIMARY KEY and/or UNIQUE specifications to ensure that every such
table does have at least one key.
SQL and Relational Theory: How to Write Accurate SQL Code
By C. J. Date
In standard SQL PRIMARY KEY
implies uniqueness but you can specify that explicitly (using UNIQUE).
implies NOT NULL but you can specify that explicitly when creating columns (but you should be avoiding nulls anyhow!)
allows you to omit its columns in a FOREIGN KEY but you can specify them explicitly.
can be declared for only one key per table but it is not clear why (Codd, who originally proposed the concept, did not impose such a restriction).
In some products PRIMARY KEY implies the table's clustered index but you can specify that explicitly (you may not want the primary key to be the clustered index!)
For some people PRIMARY KEY has purely psychological significance:
they think it signifies that the key will be referenced in a foreign key (this was proposed by Codd but not actually adopted by standard SQL nor SQL vendors).
they think it signifies the sole key of the table (but the failure to enforce other candidate keys leads to loss of data integrity).
they think it implies a 'surrogate' or 'artificial' key with no significance to the business (but actually imposes unwanted significance on the enterprise by being exposed to users).
Every primary key is a unique constraint, but in addition to the PK, a table can have additional unique constraints.
Say you have a table Employees, PK EmployeeID. You can add a unique constraint on SSN, for example.
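
The Employees example can be sketched as follows, using SQLite via Python's sqlite3 (an assumption, since the answers cover several products). The `Email` column and the `TaxRecords` table are hypothetical additions for the illustration. Note that NULL handling in unique columns is product-specific: SQLite accepts several NULLs, as shown here, while SQL Server accepts only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite

conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,  -- the one primary key
        SSN        TEXT UNIQUE,          -- an additional key
        Email      TEXT UNIQUE           -- hypothetical further key
    );
    -- A foreign key may reference a UNIQUE column, not only the primary key.
    CREATE TABLE TaxRecords (
        id  INTEGER PRIMARY KEY,
        SSN TEXT NOT NULL REFERENCES Employees (SSN)
    );
""")


def accepted(sql, params):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


INSERT_EMP = "INSERT INTO Employees VALUES (?, ?, ?)"
first = accepted(INSERT_EMP, (1, "111", "a@example.com"))
duplicate_ssn = accepted(INSERT_EMP, (2, "111", "b@example.com"))
duplicate_pk = accepted(INSERT_EMP, (1, "222", "c@example.com"))
# SQLite treats NULLs in a UNIQUE column as distinct from each other.
null_one = accepted(INSERT_EMP, (3, None, None))
null_two = accepted(INSERT_EMP, (4, None, None))

INSERT_TAX = "INSERT INTO TaxRecords (SSN) VALUES (?)"
fk_to_unique_ok = accepted(INSERT_TAX, ("111",))
fk_to_unique_bad = accepted(INSERT_TAX, ("999",))  # no employee has this SSN

print(first, duplicate_ssn, duplicate_pk, null_one, null_two,
      fk_to_unique_ok, fk_to_unique_bad)
```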
Unique Key constraints:
A unique key constraint requires the values in the column (or set of columns) to be unique.
It creates a nonclustered index by default.
Any number of unique constraints can be added to a table.
It allows a null value in the column.
ALTER TABLE table_name
ADD CONSTRAINT UNIQUE_CONSTRAINT
UNIQUE (column_name1, column_name2, ...)
Primary Key:
A primary key enforces uniqueness of the column data in the table.
A primary key creates a clustered index by default.
Only one primary key can be created for a table.
Multiple columns can be combined to form a single primary key.
It won't allow null values.
ALTER TABLE table_name
ADD CONSTRAINT KEY_CONSTRAINT
PRIMARY KEY (column_name)
In addition to Andrew's answer, you can only have one primary key per table but you can have many unique constraints.
Primary key's purpose is to uniquely identify a row in a table. Unique constraint ensures that a field's value is unique among the rows in table.
You can have only one primary key per table. You can have more than one unique constraint per table.
A primary key is a minimal set of columns such that any two records with identical values in those columns have identical values in all columns. Note that a primary key can consist of multiple columns.
A uniqueness constraint is exactly what it sounds like.
The UNIQUE constraint uniquely identifies each record in a database table.
The UNIQUE and PRIMARY KEY constraints both provide a guarantee for uniqueness for a column or set of columns.
A PRIMARY KEY constraint automatically has a UNIQUE constraint defined on it.
Note that you can have many UNIQUE constraints per table, but only one PRIMARY KEY constraint per table
Primary key can't be null but unique constraint is nullable.
When you choose a primary key for your table, that field is automatically indexed.
Primary keys are essentially a combination of unique + not null. A foreign key usually references a primary key, though most DBMSs also let it reference a unique key.
A unique key just imposes uniqueness on the column. The value of the field can be NULL in the case of a unique key. A row whose unique key is NULL can never be matched by a foreign key, so a unique key that is meant to be referenced is normally declared NOT NULL as well.
Both guarantee uniqueness across the rows in the table, with the exception of nulls as mentioned in some of the other answers.
Additionally, a primary key "comes with" an index, which could be either clustered or non-clustered.
There are several good answers in here so far. In addition to the fact that a primary key can't be null, is a unique constraint itself, and can be comprised of multiple columns, there are deeper meanings that depend on the database server you are using.
I am a SQL Server user, so a primary key has a more specific meaning to me. In SQL Server, by default, primary keys are also part of what is called the "clustered index". A clustered index defines the actual ordering of data pages for that particular table, which means that the primary key ordering matches the physical order of rows on disk.
I know that one, possibly more, of MySQL's table formats also supports clustered indexing, which means the same thing as it does in SQL Server... it defines the physical row ordering on disk.
Oracle provides something called Index Organized Tables, which order the rows on disk by the primary key.
I am not very familiar with DB2, however I think a clustered index means the rows on disk are stored in the same order as a separate index. I don't know if the clustered index must match the primary key, or if it can be a distinct index.
A great number of the answers here have discussed the properties of PK vs unique constraints. But it is more important to understand the difference in concept.
A primary key is considered to be an identifier of the record in the database. So these will, for example, be referenced when creating foreign key references between tables. A primary key should therefore, under normal circumstances, never contain values that have any meaning in your domain (often automatically incrementing fields are used for this).
A unique constraint is just a way of enforcing domain-specific business rules in your database schema.
Because a PK is an identifier for the record, you should never change the value of a primary key.