I would like to design a table named arguments in which the attribute name is linked to the attribute name of another table called names.
I see two ways to express it in SQL:
by creating a constraint on the table:
CREATE TABLE names ( name text UNIQUE,
short text UNIQUE,
comment text);
CREATE TABLE arguments ( name text UNIQUE,
comment text,
FOREIGN KEY (name) REFERENCES names (name));
by qualifying the attribute on-the-fly:
CREATE TABLE names ( name text UNIQUE,
short text UNIQUE,
comment text);
CREATE TABLE arguments ( name text UNIQUE REFERENCES names (name),
comment text);
I would like to know:
if one of the two is commonly known as better than the other, and
if it can have consequences that I should be aware of.
Thank you for your help.
These are just different syntax for the same end result.
Either is appropriate, but the former is a more common style in my experience. This may simply be to allow the human mind to more easily digest all the information. First describe the data, second describe how it relates to the rest of the world.
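One way to check the equivalence is to create both variants and compare what the database records about them. The following is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in (the question does not name a DBMS), and the helper foreign_keys is a hypothetical name introduced for the illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the unspecified DBMS of the question.
NAMES = "CREATE TABLE names (name text UNIQUE, short text UNIQUE, comment text)"

# Out-of-line (table-level) foreign key, as in the first variant.
OUT_OF_LINE = """CREATE TABLE arguments (
    name text UNIQUE,
    comment text,
    FOREIGN KEY (name) REFERENCES names (name))"""

# In-line (column-level) foreign key, as in the second variant.
IN_LINE = """CREATE TABLE arguments (
    name text UNIQUE REFERENCES names (name),
    comment text)"""


def foreign_keys(arguments_ddl):
    """Create both tables in a fresh in-memory database and return
    (referenced table, local column, referenced column) per foreign key."""
    con = sqlite3.connect(":memory:")
    con.execute(NAMES)
    con.execute(arguments_ddl)
    rows = con.execute("PRAGMA foreign_key_list(arguments)").fetchall()
    con.close()
    # foreign_key_list columns: id, seq, table, from, to, on_update, ...
    return [(r[2], r[3], r[4]) for r in rows]


# Both spellings should describe the same foreign key.
same = foreign_keys(OUT_OF_LINE) == foreign_keys(IN_LINE)
print(same)
```

Other systems expose the same information through their own catalog views, so the comparison would have to be adapted there.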
One comment I would make, though, is that it is more common to have IDs as unique identifiers and references. This allows you to change the value in the name field without changing its identity and breaking referential integrity. There are databases that can cascade such changes and update all occurrences of the name, but in general it is considered cleaner to have identifiers that are separate from the data.
The first option is known as an out-of-line constraint declaration and the second as an in-line declaration; functionally, both are the same.
What would be better is to assign a name to the foreign key constraint. If you have a name, you can selectively enable and disable the constraint if required.
Create table
CREATE TABLE arguments
(
name text UNIQUE,
comment text,
constraint arguments_fk FOREIGN KEY (name) REFERENCES names (name)
);
Disable constraint
ALTER TABLE arguments NOCHECK CONSTRAINT arguments_fk;
Enable constraint
ALTER TABLE arguments CHECK CONSTRAINT arguments_fk;
This is for SQL Server. Oracle has equivalent commands.
Use a foreign key. If you later find you have a performance problem (a measurable problem), then you can change it up to be different.
Keep things simple at first and get the product into the users' hands as fast as possible. Don't optimize things that you can't prove need it.
Let's assume I have a table called boxes with the box_id attribute as the PK.
There are two other tables. The first one is red_boxes and the second blue_boxes.
I have added a constraint to the red_boxes table
ALTER TABLE red_boxes
ADD CONSTRAINT fk_box_id
FOREIGN KEY (box_id)
REFERENCES boxes (box_id);
Now I would like to add the same constraint to the blue_boxes table. The statement below would work if I had not already added a constraint with that name to red_boxes. The obvious fix is to give the new constraint a different name, e.g. fk_box_id2, but is this a good way? Am I supposed to somehow reuse the previous constraint, or is that not possible, and if so, why?
ALTER TABLE blue_boxes
ADD CONSTRAINT fk_box_id
FOREIGN KEY (box_id)
REFERENCES boxes (box_id)
Each constraint is separate and requires a unique name. My recommendation is to use the source and destination table names, for example fk_red_boxes_boxes and fk_blue_boxes_boxes. This way you can easily identify where they come from and where they go to.
If you have underscores in your table names, you might want to come up with a modified convention that you can easily understand at a glance. For example, a double underscore: fk__blue_boxes__boxes and fk__red_boxes__boxes.
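If the statements are generated by a script, the naming convention can be applied mechanically. A minimal sketch, assuming the double-underscore convention above; fk_name and add_fk_statement are hypothetical helpers, and the table and column names are interpolated without any quoting or validation.

```python
def fk_name(source_table, target_table, separator="__"):
    """Build a constraint name from the referencing and referenced tables."""
    return f"fk{separator}{source_table}{separator}{target_table}"


def add_fk_statement(source_table, target_table, column):
    """Return an ALTER TABLE statement that adds a uniquely named foreign key.

    Assumes the column has the same name in both tables, as box_id does here.
    """
    name = fk_name(source_table, target_table)
    return (
        f"ALTER TABLE {source_table} ADD CONSTRAINT {name} "
        f"FOREIGN KEY ({column}) REFERENCES {target_table} ({column});"
    )


for table in ("red_boxes", "blue_boxes"):
    print(add_fk_statement(table, "boxes", "box_id"))
```

Because the source table is part of the name, the two constraints can never collide.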
I have a PostgreSQL 9.3 database with a users table that stores usernames in their case-preserved format. All queries will be case insensitive, so I should have an index that supports that. Additionally, usernames must be unique, regardless of case.
This is what I have come up with:
forum=> \d users
Table "public.users"
Column | Type | Modifiers
------------+--------------------------+------------------------
name | character varying(24) | not null
Indexes:
"users_lower_idx" UNIQUE, btree (lower(name::text))
Expressed in standard SQL syntax:
CREATE TABLE users (
name varchar(24) NOT NULL
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
With this schema, I've satisfied all my constraints, albeit without a primary key. The SQL standard doesn't support functional primary keys, so I cannot promote the index:
forum=> ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_idx;
ERROR: index "users_lower_idx" contains expressions
LINE 1: ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_id...
^
DETAIL: Cannot create a primary key or unique constraint using such an index.
But, I already have the UNIQUE constraint, and the column is already marked "NOT NULL." If I had to have a primary key, I could construct the table like this:
CREATE TABLE users (
name varchar(24) PRIMARY KEY
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
But then I'll have two indexes, and that seems wasteful and unnecessary to me. So, does PRIMARY KEY mean anything special to postgres beyond "UNIQUE NOT NULL," and am I missing anything by not having one?
First off, practically every table should have a primary key.
citext
The additional module provides a data type of the same name. "ci" for case insensitive. Per documentation:
The citext module provides a case-insensitive character string type,
citext. Essentially, it internally calls lower when comparing
values. Otherwise, it behaves almost exactly like text.
It is intended for exactly the purpose you describe:
The citext data type allows you to eliminate calls to lower in SQL
queries, and allows a primary key to be case-insensitive.
Bold emphasis mine.
Be sure to read the manual about limitations first. Install it once per database with
CREATE EXTENSION citext;
text
If you don't want to go that route, I suggest you add a serial as surrogate primary key.
CREATE TABLE users (
user_id serial PRIMARY KEY
, username text NOT NULL
);
I would use text instead of varchar(24). Use a CHECK constraint if you need to enforce a maximum length (that may change at a later time). Details:
Any downsides of using data type "text" for storing strings?
Change PostgreSQL columns used in views
Along with the UNIQUE index in your original design (without type cast):
CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username));
The underlying integer of a serial is small and fast and does not have to waste time with lower() or the collation of your database. That's particularly useful for foreign key references. I mostly prefer that over some natural primary key with varying properties.
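The combination of a surrogate key and a unique index on lower(username) can be tried out quickly. This is a sketch only: it runs against SQLite through Python's sqlite3 module rather than PostgreSQL, with INTEGER PRIMARY KEY standing in for serial, and try_insert is a hypothetical helper.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Assumption: INTEGER PRIMARY KEY plays the role of PostgreSQL's serial here.
con.execute(
    """CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL)"""
)
# Uniqueness is enforced on the lower-cased value, not on the stored one.
con.execute(
    "CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username))"
)


def try_insert(username):
    """Insert a username; return False if the unique index rejects it."""
    try:
        with con:  # commits on success, rolls back on error
            con.execute("INSERT INTO users (username) VALUES (?)", (username,))
        return True
    except sqlite3.IntegrityError:
        return False


results = [try_insert("Alice"), try_insert("alice"), try_insert("Bob")]
stored = [row[0] for row in con.execute(
    "SELECT username FROM users ORDER BY user_id")]
print(results, stored)
```

The stored value keeps its original case, while a second spelling that differs only in case is refused.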
Both solutions have pros and cons.
I would suggest using a primary key, since you have stated that you want something unique and have demonstrated that you can put a unique constraint on the username. I will assume that, since this is a unique, not null username, you will use it to track your users in other parts of the database, and that you will allow usernames to be changed.
This is where a primary key comes in handy: instead of having to go into all of your tables and change the value of the Username column, you will only have one place to change it.
Example
Without primary key:
Table users
Username
'Test'
Table thingsdonebyUsers
RandomColumn   AnotherColumn   Username
RandomValue    RandomValue     Test
Now assume your user wants to change his username to Test1. You have to find everywhere you used Username and change it to the new value before you change it in your users table, since I'm assuming you will have a constraint there.
With Primary Key
Table users
PK   Username
1    'Test'
Table thingsdonebyUsers
RandomColumn   AnotherColumn   PK_Users
RandomValue    RandomValue     1
Now you can just change your users table and be done with the change.
You can still enforce unique and not null on your username column as you demonstrated.
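The rename scenario can be sketched as follows. It is an illustration under assumptions: SQLite via Python's sqlite3 module instead of PostgreSQL, and snake_case spellings of the example table and column names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.execute(
    "CREATE TABLE users (pk INTEGER PRIMARY KEY, username TEXT NOT NULL UNIQUE)"
)
# The child table stores the surrogate key, never the username itself.
con.execute(
    """CREATE TABLE things_done_by_users (
        random_column TEXT,
        pk_users INTEGER NOT NULL REFERENCES users (pk))"""
)
con.execute("INSERT INTO users (pk, username) VALUES (1, 'Test')")
con.execute("INSERT INTO things_done_by_users VALUES ('RandomValue', 1)")

# Renaming touches exactly one row in one table.
con.execute("UPDATE users SET username = 'Test1' WHERE pk = 1")

# The child row still points at the same user and sees the new name.
row = con.execute(
    """SELECT u.username, t.random_column
       FROM things_done_by_users AS t
       JOIN users AS u ON u.pk = t.pk_users"""
).fetchone()
print(row)
```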
This is just one of the many advantages of having normalized tables whose primary key is an unrelated value (the proper name for this is a surrogate key).
As for what a PK actually signifies, it is just a non-nullable unique column that identifies the row, so in this sense you already have a primary key on your table. The thing is that PKs are usually INT numbers, for the reason that I explained above.
Short answer: No, you don't need a declarative "PRIMARY KEY", since the UNIQUE index serves the same exact purpose.
Long answer:
The idea of having Primary Keys comes from database systems where the data is physically in key order. This requires having a single, "primary" key. MySQL InnoDB is this way, as are many older databases.
However, PostgreSQL does not keep the tables in key order; it separates the indexes, including the primary key index, from the heap, which is essentially unordered. As a result, in Postgres, there is no material difference between primary keys and unique indexes. You can even create a foreign key against a unique index, as long as that index covers the whole table (that is, it is not a partial index).
That being said, some tools external to PostgreSQL look for primary keys and do not regard unique indexes as being equivalent. These tools may cause you issues because of not finding a PK.
I am trying to enforce a CHECK constraint in an Oracle database that spans multiple tables:
CREATE TABLE RollingStocks (
Id NUMBER,
Name Varchar2(80) NOT NULL,
RollingStockCategoryId NUMBER NOT NULL,
CONSTRAINT Pk_RollingStocks Primary Key (Id),
CONSTRAINT Check_RollingStocks_CategoryId
CHECK ((RollingStockCategoryId IN (SELECT Id FROM FreightWagonTypes))
OR
(RollingStockCategoryId IN (SELECT Id FROM LocomotiveClasses)))
);
...but I get the following error:
*Cause: Subquery is not allowed here in the statement.
*Action: Remove the subquery from the statement.
Can you help me understand what the problem is, or how to achieve the same result?
Check constraints are very limited in Oracle. To do a check like you propose, you'd have to implement a PL/SQL trigger.
My advice would be to avoid triggers altogether. Implement a stored procedure that modifies the database and includes the checks. Stored procedures are easier to maintain, although they are slightly harder to implement. But changing a front end from direct table access to stored procedure access pays back many times in the long run.
What you are trying to do is ensure that the values inserted in one table exist in another table, i.e. enforce a foreign key. So that would be:
CREATE TABLE RollingStocks (
...
CONSTRAINT Pk_RollingStocks Primary Key (Id),
CONSTRAINT RollingStocks_CategoryId_FK FOREIGN KEY (RollingStockCategoryId)
REFERENCES FreightWagonTypes (Id)
);
Except that you want to enforce a foreign key which references two tables. This cannot be done.
You have a couple of options. One would be to merge FreightWagonTypes and LocomotiveClasses into a single table. If you need separate tables for other parts of your application then you could build a materialized view for the purposes of enforcing the foreign key. Materialized Views are like tables and can be referenced by foreign keys. This option won't work if the key values for the two tables clash.
Another option is to recognise that the presence of two candidate referenced tables suggests that RollingStock maybe needs to be split into two tables - or perhaps three: a super type and two sub-type tables, that is RollingStock and FreightWagons, Locomotives.
By the way, what about PassengerCoaches, GuardsWagons and RestaurantCars?
Oracle doesn't support complex check constraints like that, unfortunately.
In this case, your best option is to change the data model a bit - add a parent table over FreightWagonTypes and LocomotiveClasses, which will hold all the ids from both of these tables. That way you can add a FK to a single table.
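The parent-table approach might look roughly like this. It is a sketch under assumptions, not Oracle DDL: it runs on SQLite through Python's sqlite3 module, the parent table name RollingStockCategories is invented for the illustration, and insert_stock is a hypothetical helper.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.executescript(
    """
    -- Hypothetical parent table holding every id from both type tables.
    CREATE TABLE RollingStockCategories (Id INTEGER PRIMARY KEY);
    CREATE TABLE FreightWagonTypes (
        Id INTEGER PRIMARY KEY REFERENCES RollingStockCategories (Id));
    CREATE TABLE LocomotiveClasses (
        Id INTEGER PRIMARY KEY REFERENCES RollingStockCategories (Id));
    CREATE TABLE RollingStocks (
        Id INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        RollingStockCategoryId INTEGER NOT NULL
            REFERENCES RollingStockCategories (Id));
    INSERT INTO RollingStockCategories (Id) VALUES (1), (2);
    INSERT INTO FreightWagonTypes (Id) VALUES (1);
    INSERT INTO LocomotiveClasses (Id) VALUES (2);
    """
)


def insert_stock(name, category_id):
    """Insert a rolling stock row; return False if the foreign key rejects it."""
    try:
        with con:
            con.execute(
                "INSERT INTO RollingStocks (Name, RollingStockCategoryId) "
                "VALUES (?, ?)",
                (name, category_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_stock("Boxcar", 1)        # id known to the parent table
rejected = insert_stock("Ghost train", 99)  # id unknown everywhere
print(accepted, rejected)
```

Because both type tables draw their ids from the parent, a single ordinary foreign key replaces the subquery in the CHECK constraint.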
I am quite confused about the difference between a FOREIGN KEY and a CHECK constraint; they appear to me to achieve the same result.
I mean, I could create a table and enforce a foreign key on another table, but I could also create a CHECK to ensure the value is in another table.
What is the difference and when to use the one or the other?
A FOREIGN KEY constraint ensures that the entry exists in another table.
EDIT: as Mark Byers correctly commented, it can exist in another table or in the same table.
A CHECK constraint ensures that the entry follows some rule.
CHECK Constraints
CHECK constraints enforce domain integrity by limiting the values that are accepted by a column. They are similar to FOREIGN KEY constraints in that they control the values that are put in a column. The difference is in how they determine which values are valid: FOREIGN KEY constraints obtain the list of valid values from another table, and CHECK constraints determine the valid values from a logical expression that is not based on data in another column.
A foreign key constraint is more powerful than a CHECK constraint.
A foreign key constraint means that the column (in the current table) can only have values that already exist in the column of the foreign table (which can be the same table, as is often done for hierarchical data). This means that as the list of values changes, getting bigger or smaller, there's no need to update the constraint.
A check constraint can not reference any columns outside of the current table, and can not contain a subquery. Often, the values are hard coded like BETWEEN 100 and 999 or IN (1, 2, 3). This means that as things change, you'll have to update the CHECK constraint every time. Also, a foreign key relationship is visible on an Entity Relationship Diagram (ERD), while a CHECK constraint will never be. The benefit is that someone can read the ERD and construct a query from it without using numerous DESC table commands to know what columns are where and what relates to what to construct proper joins.
Best practice is to use foreign keys (and supporting tables) first. Use CHECK constraints as a backup for situations where you can't use a foreign key, not as the primary solution to validate data.
It depends on your DBMS (which you didn't specify), but in one sense, you are correct: a foreign key constraint is a particular case of a check constraint. There are DBMS which would not allow you to formulate a foreign key constraint as a check constraint.
The main intention of a check constraint is to describe conditions that apply to a single row in the table. For example, I have a table of elements (as in Hydrogen, Helium, ...) and the symbols for the elements are constrained to start with an upper-case letter and are followed by zero, one or two lower-case letters (two lower-case letters for as yet undiscovered but predicted elements: Uus - ununseptium (117), which has just been isolated but has yet to be named). This can be the subject of a CHECK constraint:
CHECK(Symbol MATCHES "[A-Z][a-z]{0,2}")
assuming MATCHES exists and supports an appropriate regular expression language.
You can also have check constraints that compare values:
CHECK(OrderDate <= ShipDate OR ShipDate IS NULL)
To express a foreign key constraint as a check constraint, you have to be permitted to execute a query in the CHECK clause. Hypothetically:
CHECK(EXISTS(SELECT * FROM SomeTable AS s
WHERE ThisTable.pk_col1 = s.pk_col1 AND
ThisTable.pk_col2 = s.pk_col2))
This example shows some of the problems. I don't have a convenient table alias for the table in which I'm writing the check constraint - I assumed it was 'ThisTable'. The construct is verbose. Assuming that the primary key on SomeTable is declared on columns pk_col1 and pk_col2, then the FOREIGN KEY clause is much more compact:
FOREIGN KEY (pk_col1, pk_col2) REFERENCES SomeTable
Or, if you are referencing an alternative key, not the primary key:
FOREIGN KEY (pk_col1, pk_col2) REFERENCES SomeTable(ak_col1, ak_col2)
This is notationally more compact - so there is less chance of getting it wrong - and can be special-cased by the server because the special notation means it knows that it is dealing with a foreign key constraint whereas the general check clause has to be scrutinized to see if it matches one of many possible forms that are equivalent to the foreign key.
The question asks: when to use a check constraint and when to use a foreign key constraint?
Use a CHECK constraint to specify criteria that can be checked in a single row.
Use a FOREIGN KEY constraint to specify that the values in the current row must match the values of a row in some other unique key (a candidate key, usually the primary key rather than an alternative key) of some table - which may be the same table or (more usually) a different table.
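The single-row case can be made concrete with the date comparison from earlier in this answer. A minimal sketch, assuming SQLite through Python's sqlite3 module, dates stored as ISO-8601 text (so that string comparison matches date order), and an invented orders table; try_order is a hypothetical helper.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The CHECK only looks at columns of the row being written.
con.execute(
    """CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT NOT NULL,
        ship_date  TEXT,
        CHECK (order_date <= ship_date OR ship_date IS NULL))"""
)


def try_order(order_date, ship_date):
    """Insert an order; return False if the CHECK constraint rejects it."""
    try:
        with con:
            con.execute(
                "INSERT INTO orders (order_date, ship_date) VALUES (?, ?)",
                (order_date, ship_date),
            )
        return True
    except sqlite3.IntegrityError:
        return False


shipped_later = try_order("2024-01-10", "2024-01-12")    # valid
not_shipped = try_order("2024-01-10", None)              # valid, no ship date
shipped_earlier = try_order("2024-01-10", "2024-01-05")  # violates the rule
print(shipped_later, not_shipped, shipped_earlier)
```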
Consider a scenario like this:
Table A has a keyword column, and the value must be among thousands of keywords provided.
How would you like to implement the constraint?
With a hard-coded check condition like check (keyword in ('a', 'b', 'c' .......)), or by simply importing the provided keywords as another table and setting a foreign key constraint on the keyword column of Table A?
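The lookup-table option can be sketched as below, again assuming SQLite through Python's sqlite3 module; the table names keywords and a, and the helper try_insert, are made up for the example. The point it illustrates is that extending the list of valid keywords is an INSERT, not a schema change.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.execute("CREATE TABLE keywords (keyword TEXT PRIMARY KEY)")
con.execute(
    """CREATE TABLE a (
        id INTEGER PRIMARY KEY,
        keyword TEXT NOT NULL REFERENCES keywords (keyword))"""
)
with con:
    # Import the provided keywords as plain data.
    con.executemany(
        "INSERT INTO keywords (keyword) VALUES (?)", [("a",), ("b",), ("c",)]
    )


def try_insert(keyword):
    """Insert a row into a; return False if the keyword is not listed."""
    try:
        with con:
            con.execute("INSERT INTO a (keyword) VALUES (?)", (keyword,))
        return True
    except sqlite3.IntegrityError:
        return False


before = try_insert("d")  # rejected: 'd' is not a known keyword yet
with con:
    con.execute("INSERT INTO keywords (keyword) VALUES ('d')")
after = try_insert("d")   # accepted without touching any constraint
print(before, after)
```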
Is there any way to create, let's say, a pattern for a primary key? For example, for the table products the pattern would be p-1, p-2, ... p-n, etc.
Thanks
Well, you can manually create and enforce that pattern into your application (or using triggers). A primary key just needs to be unique to work.
But I don't recommend it. In your sample, P-1 seems to have a business meaning. And if it belongs to your business realm, it can be changed. While most databases have an UPDATE CASCADE equivalent, that doesn't change the basic reason you shouldn't use it as a key: it's information, not data.
I suggest you create a field named ProductCode char(10) NOT NULL UNIQUE and maybe fill it with P-00000001, P-00000002, and so on. Maybe you prefer to use a varchar: this doesn't matter, as long as it fulfills your business requirement. Create an Id INTEGER AUTO_INCREMENT PRIMARY KEY field to use as the primary key, as it never needs to be changed.
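One possible way to fill such a column is sketched below. It rests on several assumptions: SQLite through Python's sqlite3 module (so AUTOINCREMENT instead of AUTO_INCREMENT), a product_code that is derived once from the freshly generated id, and a NOT NULL that is relaxed so the code can be written in a second step. The helper add_product is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE products (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        product_code TEXT UNIQUE,   -- NOT NULL relaxed for the two-step insert
        name         TEXT NOT NULL)"""
)


def add_product(name):
    """Insert a product and give it a code of the form P-00000001."""
    with con:  # both statements commit together or not at all
        cur = con.execute("INSERT INTO products (name) VALUES (?)", (name,))
        new_id = cur.lastrowid
        code = f"P-{new_id:08d}"
        con.execute(
            "UPDATE products SET product_code = ? WHERE id = ?", (code, new_id)
        )
    return code


first = add_product("Widget")
second = add_product("Gadget")
print(first, second)
```

The code is only initialized from the id; afterwards it is ordinary data that the business can change without affecting any reference to the key.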