I want to know the advantages of using an identity column over a unique key. I have always used identity columns and understand them to be the better choice, but I would like to know the concrete benefits of an identity key over a unique key.
Thanks
Are you trying to decide between using an identity column vs. a uniqueidentifier column as a primary key?
Uniqueidentifier values (GUIDs) have the advantage of being unique across the entire environment, making it easier to uniquely identify a record outside of the context of a given table. They also give you a key you can expose in your applications that is not easily iterated.
Identity columns use less storage (an int is 4 bytes and a bigint 8, against 16 for a uniqueidentifier), so their indexes are smaller and generally faster.
There is no right answer for their use. It really depends on the context.
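A foreign key in another table has to reference a primary key (or a unique constraint). An identity column on its own carries no such constraint, so it cannot be referenced unless you also declare it as a key. Without seeing the type of data you are trying to store, I would recommend using an identity column and also declaring it as your primary key.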
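As a rough sketch of that recommendation, here is an auto-generated column that is also declared as the primary key and then referenced from a second table. It uses Python's built-in sqlite3 module purely as a stand-in (the thread does not name a database), and the customer and purchase tables are made up for illustration.

```python
import sqlite3


def demo_identity_primary_key():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated AND declared as the key
            name TEXT NOT NULL
        );
        CREATE TABLE purchase (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id INTEGER NOT NULL REFERENCES customer(id)
        );
    """)
    cur = conn.execute("INSERT INTO customer (name) VALUES ('Ada')")
    customer_id = cur.lastrowid  # the generated key value
    conn.execute("INSERT INTO purchase (customer_id) VALUES (?)", (customer_id,))
    try:
        # No customer has this id, so the foreign key rejects the row.
        conn.execute("INSERT INTO purchase (customer_id) VALUES (?)", (customer_id + 1,))
        orphan_rejected = False
    except sqlite3.IntegrityError:
        orphan_rejected = True
    conn.close()
    return customer_id, orphan_rejected


if __name__ == "__main__":
    print(demo_identity_primary_key())
```

The generated value only becomes something other tables can point at because the column is also the primary key.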
I'm having trouble finding anything definitive about whether an IDENTITY column needs to be declared as PRIMARY KEY in Oracle 12.2c. Does an IDENTITY column automatically create an index, like a PK? Is declaring both just redundant? I do believe you can have an IDENTITY column and a separate PK, though we are not doing that.
ID NUMBER AS IDENTITY PRIMARY KEY == ID NUMBER AS IDENTITY ?
Does an IDENTITY column automatically create an index, like a PK?
No. An identity column is just a column auto-populated with a sequentially generated number. You can use it however you want, but the typical use is as a synthetic primary key.
Is it just being redundant?
No.
I do believe you can have an IDENTITY column and separate PK
Yes, you can.
though we are not doing that.
That's fine, if you mean you don't have a separate PK column in addition to the identity column. Defining a PK constraint over the identity column would be a good idea.
It's a common mistake to mix logical and physical organization of data.
You successfully mixed 3 orthogonal concepts:
logical: PRIMARY KEY constraint
physical: INDEX
automatic value generation: IDENTITY column
Does an IDENTITY column automatically create an index, like a PK? Is it just being redundant?
The answers to those questions are very version dependent. IDENTITY itself was only introduced in Oracle 12c.
I do believe you can have an IDENTITY column and separate PK, though we are not doing that.
You are correct here.
Auto value generation, logical constraint and physical data organization are orthogonal to each other.
An IDENTITY column can be and often is useful as primary key, but it doesn't have to be.
The identity column is very useful as a surrogate primary key column. When you insert a new row into a table with an identity column, Oracle auto-generates and inserts a sequential value into the column.
https://www.oracletutorial.com/oracle-basics/oracle-identity-column/
What is its use, when both identify a unique row?
Why do people use an identity column as a primary key?
Can anyone briefly explain?
A primary key is a logical concept: it is the means by which you uniquely identify each record in a table. There are several types of primary key. A natural key uses a data attribute from the business domain that is guaranteed to meet the requirements for a primary key (unique, not null, immutable), such as a social security number. A compound key is made up of multiple columns (often used in "parent-child" relationships). A surrogate key is created by the system; it could be an auto-increment or identity column.
Identity is a column property rather than a data type in its own right: the column has an ordinary numeric type, and the DBMS generates its values. It is very useful as a surrogate primary key, because it has all the attributes required. It's unlikely you'd use an identity column for purposes other than as a primary key, but there's nothing to stop you from doing so.
So, not all primary keys are identity columns, and not all identity columns are primary keys.
A primary key is a kind of unique key. It is, in fact, a constraint: the values of a specific column (or, in the general case, a set of columns) cannot be the same in different rows, even when you explicitly set them to the same value with an INSERT or UPDATE.
A primary/unique key isn't required to be auto-incremented. In fact, it isn't required to be an integer at all; it can be text or another type.
A primary key is a bit stricter than an ordinary unique key in that it usually implies NOT NULL and, in addition, only one primary key is allowed per table (while several unique keys per table are allowed in addition to the primary key).
Creating a primary/unique key usually implicitly creates an index to make searches and constraint checks on those column(s) faster.
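To make the implicit index visible, here is a small sketch using Python's built-in sqlite3 module as a stand-in for "your RDBMS". The three table names are hypothetical, and other engines name and report their indexes differently. One SQLite-specific caveat: an INTEGER PRIMARY KEY is an alias for the internal rowid and gets no separate index, which is why the example uses a TEXT key.

```python
import sqlite3


def index_names(conn, table):
    # PRAGMA index_list returns one row per index; position 1 holds the index name.
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


def implicit_indexes():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE plain (code TEXT);              -- no constraint, so no index
        CREATE TABLE keyed (code TEXT PRIMARY KEY);  -- constraint brings an index with it
        CREATE TABLE uniq  (code TEXT UNIQUE);       -- same for a unique key
    """)
    result = {t: index_names(conn, t) for t in ("plain", "keyed", "uniq")}
    conn.close()
    return result


if __name__ == "__main__":
    print(implicit_indexes())
```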
E.g. if column my_column of my_table is marked as primary or unique key, you can't do this:
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …);
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …); -- the same value for my_column again
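If you want to see the rejection happen, here is a runnable sketch of the same two inserts using Python's built-in sqlite3 module as a stand-in database. The table is cut down to two columns and the values 'first' and 'second' are placeholders of mine for the elided ones.

```python
import sqlite3


def second_insert_rejected():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (my_column INTEGER PRIMARY KEY, other_column TEXT)"
    )
    conn.execute("INSERT INTO my_table (my_column, other_column) VALUES (10, 'first')")
    try:
        # Same key value again: the primary key constraint refuses the row.
        conn.execute("INSERT INTO my_table (my_column, other_column) VALUES (10, 'second')")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


if __name__ == "__main__":
    print(second_insert_rejected())
```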
Identity in your RDBMS is what other RDBMSes may call auto_increment or serial. It's just a feature whereby, during a row insert, a specific column that is not explicitly set to a value is automatically initialized to (most often) consecutive integer values.
E.g. if column my_column of my_table is marked as auto_increment/serial/identity, you can do this:
INSERT INTO my_table (other_column, third_column) VALUES (…, …);
-- not specifying any value for my_column manually,
-- it'll be initialized automatically to some value
-- (usually an increasing integer sequence)
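A runnable version of the same idea, again with Python's built-in sqlite3 module standing in for the real database. In SQLite an INTEGER PRIMARY KEY column plays the auto-generated role; the two-column table and the values 'a', 'b', 'c' are placeholders.

```python
import sqlite3


def auto_generated_ids():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (my_column INTEGER PRIMARY KEY, other_column TEXT)"
    )
    ids = []
    for value in ("a", "b", "c"):
        # my_column is not mentioned, so the database picks its value.
        cur = conn.execute("INSERT INTO my_table (other_column) VALUES (?)", (value,))
        ids.append(cur.lastrowid)
    conn.close()
    return ids


if __name__ == "__main__":
    print(auto_generated_ids())
```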
Auto_increment/serial/identity usually doesn't guarantee that the automatic values are strictly consecutive (especially when transactions are aborted).
Specifically, the documentation for Transact-SQL says that identity doesn't guarantee:
uniqueness (use unique/primary keys to enforce that);
strictly consecutive values.
Update: As a_horse_with_no_name suggested, "identity" appears to be not only the name of the common auto_increment/serial/identity feature within specific RDBMSes (e.g. Microsoft SQL Server), but also a name defined by the ANSI SQL standard.
AFAIK, it doesn't differ very much from what I described above (about the common auto_increment/serial/identity feature in implementations). I mean that it automatically initializes column values from an integer sequence, but doesn't guarantee uniqueness or strictly consecutive values.
Still, I suppose that, unlike auto_increment/serial columns in MySQL/PostgreSQL, an ANSI-SQL-standard generated always as identity column doesn't allow its values to be set manually in INSERT or UPDATE (only automatically).
In a database table, every row should be unique, and you need to be able to identify a particular row uniquely.
A given table may have one or more columns which have unique values, so any of these columns can do the job. There may also be two or more columns which, while not unique of themselves, form a unique combination. That will also do.
Any column, or combination of columns, which can uniquely identify a row is called a candidate key. In principle, you can choose any key that you like, but you need to ensure that uniqueness is enduring. For example, in a small table of persons, the given name may be unique, but you run the risk of blowing that with the next additional person.
A primary key is the candidate key you nominate as your preferred key. For example, a table of persons may have a number of unique attributes such as email address, mobile phone number and others. The primary key is the attribute you choose in preference to the others.
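Here is a minimal sketch of that persons table, using Python's built-in sqlite3 module as a stand-in. The column names and sample values are invented: email and mobile are declared as candidate keys, id is the nominated primary key, and given_name is deliberately left without a uniqueness guarantee.

```python
import sqlite3


def candidate_key_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE person (
            id         INTEGER PRIMARY KEY,   -- the key we nominated
            email      TEXT NOT NULL UNIQUE,  -- candidate key
            mobile     TEXT NOT NULL UNIQUE,  -- candidate key
            given_name TEXT NOT NULL          -- unique today, but not guaranteed
        )
    """)
    rows = [
        ("ada@example.com", "555-0100", "Ada"),
        ("ada.b@example.com", "555-0101", "Ada"),  # a second Ada is perfectly legal
    ]
    conn.executemany(
        "INSERT INTO person (email, mobile, given_name) VALUES (?, ?, ?)", rows
    )
    # Any candidate key pins down exactly one row.
    by_email = conn.execute(
        "SELECT id FROM person WHERE email = 'ada@example.com'"
    ).fetchall()
    by_mobile = conn.execute(
        "SELECT id FROM person WHERE mobile = '555-0100'"
    ).fetchall()
    # The given name no longer does.
    by_name = conn.execute(
        "SELECT id FROM person WHERE given_name = 'Ada'"
    ).fetchall()
    conn.close()
    return len(by_email), len(by_mobile), len(by_name)


if __name__ == "__main__":
    print(candidate_key_demo())
```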
The following is not strictly required, but is good practice for a good Primary Key:
A Primary Key shouldn’t change
A Primary Key shouldn’t be recycled
For this reason, a Primary Key shouldn’t have any real meaning, so there should never be a reason to change or reuse it. As a result, the primary key is often an arbitrary code whose only real meaning is that it identifies the row. If the key is purely used for identification and has no other meaning, it is often referred to as a Surrogate Key.
You can put some effort into generating arbitrary codes. Sometimes they follow complex patterns which can be used to check their validity.
If you want to take a lazier approach, you can use a sequence number. Contrary to my previous advice, though, it does sort of have a meaning: it is strictly sequential, so you can learn which row was added after another, but not exactly when. However, that fact won’t change — it will not change in value, and will not be reused — so it’s still pretty stable.
An identity column is, in fact, a sequence number. It is auto-generated, and very useful if you want an arbitrary code for your primary key. Unfortunately it arrived relatively late in the very slow-moving standards, so every DBMS had its own non-standard variation:
MySQL calls it AUTO_INCREMENT
SQLite calls it AUTOINCREMENT
MSSQL calls it IDENTITY()
MSACCESS calls it AUTONUMBER
PostgreSQL calls it SERIAL
and each has its own quirks.
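As an illustration of one such quirk (and of the earlier point that a key shouldn't be recycled), here is a sketch in Python's built-in sqlite3 module. In SQLite a plain INTEGER PRIMARY KEY can hand out the highest id again after that row is deleted, whereas adding AUTOINCREMENT prevents reuse. The table t and its labels are made up for the example.

```python
import sqlite3


def next_id_after_deleting_last(autoincrement):
    conn = sqlite3.connect(":memory:")
    key = "INTEGER PRIMARY KEY AUTOINCREMENT" if autoincrement else "INTEGER PRIMARY KEY"
    conn.execute(f"CREATE TABLE t (id {key}, label TEXT)")
    conn.executemany("INSERT INTO t (label) VALUES (?)", [("a",), ("b",), ("c",)])
    conn.execute("DELETE FROM t WHERE id = 3")  # remove the most recent row
    cur = conn.execute("INSERT INTO t (label) VALUES ('d')")
    new_id = cur.lastrowid
    conn.close()
    return new_id


if __name__ == "__main__":
    print(next_id_after_deleting_last(False))  # plain INTEGER PRIMARY KEY
    print(next_id_after_deleting_last(True))   # with AUTOINCREMENT
```

Without AUTOINCREMENT the new row gets id 3 again; with it, the new row gets 4.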
More recently, (2003, I believe) it has been added to the standards in the form of:
int generated by default as identity
but this has only just started to appear in PostgreSQL and Oracle. This use of IDENTITY behaves differently to Microsoft’s.
A primary key requires unique values and cannot be null. An identity column supplies such values automatically, so we don't need to set the key manually for each new record added to the table.
Note that in SQL Server the identity value is still incremented when an insert fails for some reason.
I am working on a voting table design using Postgres 9.5 (but maybe the question itself is applicable to SQL in general). My vote table should look like:
-------------------------
object | user | timestamp
-------------------------
Where object and user are foreign keys to the ids corresponding to their own tables. I have a problem identifying what actually should be a primary key.
At first I thought of making it primary_key(object, user), but I use Django as a server and it just doesn't support multicolumn primary keys. I am also not sure about the performance, since I may access a row using only one of those 2 columns (i.e. object or user). The advantage of this idea is that it automatically works as a unique key, since the same user shouldn't vote twice for the same object, and I don't need any additional indexes.
The other idea is to introduce an auto or serial id field. I really can't think of any advantage of this approach, especially when the table gets bigger. I would also need to introduce at least a unique_key(object, user), which adds to the computational complexity and data storage. I'm not even sure about the performance when I select using one of the 2 columns; maybe I also need 2 additional indexes on object and user to speed up the select operation, since I need this heavily.
Is there something I am missing here? Or is there a better idea?
The Django developers themselves recognise that the "natural primary key" in this case is not supported. So your gut feeling is right, but Django doesn't support it.
https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys
Relational database designs use a set of columns as the primary key
for a table. When this set includes more than one column, it is known
as a “composite” or “compound” primary key. (For more on the
terminology, here is an article discussing database keys).
Currently Django models only support a single column in this set,
denying many designs where the natural primary key of a table is
multiple columns. Django currently can't work with these schemas; they
must instead introduce a redundant single-column key (a “surrogate”
key), forcing applications to make arbitrary and otherwise-unnecessary
choices about which key to use for the table in any given instance.
I'm less familiar with Django personally. One option might be to form an extra column as a primary key by concatenating object and user.
Remember that there is nothing special about a primary key. You can always add a UNIQUE KEY on the pair of columns and make them both NOT NULL.
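A quick sketch of that fallback, using Python's built-in sqlite3 module rather than Postgres so it runs anywhere. The column names object_id, user_id and voted_at are my own stand-ins for the question's object, user and timestamp.

```python
import sqlite3


def duplicate_vote_rejected():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE vote (
            id        INTEGER PRIMARY KEY,  -- single-column surrogate key for the ORM
            object_id INTEGER NOT NULL,
            user_id   INTEGER NOT NULL,
            voted_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (object_id, user_id)     -- the real business rule
        )
    """)
    conn.execute("INSERT INTO vote (object_id, user_id) VALUES (1, 42)")
    conn.execute("INSERT INTO vote (object_id, user_id) VALUES (2, 42)")  # other object: fine
    try:
        # Same user voting on the same object again.
        conn.execute("INSERT INTO vote (object_id, user_id) VALUES (1, 42)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


if __name__ == "__main__":
    print(duplicate_vote_rejected())
```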
You might find this example useful.
https://thecuriousfrequency.wordpress.com/2014/11/11/make-primary-key-with-two-or-more-field-in-django/
The correct solution would be to have a PRIMARY KEY (object, user) and an additional index on user. The primary key index can also be used for searches for object alone.
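To check the claim about index usage, here is a sketch in Python's built-in sqlite3 module (not Postgres, so the planner output and its exact wording are SQLite's, and may differ between versions). The names object_id, user_id, voted_at and vote_user_idx are assumptions for illustration.

```python
import sqlite3


def vote_query_plans():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vote (
            object_id INTEGER NOT NULL,
            user_id   INTEGER NOT NULL,
            voted_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (object_id, user_id)
        );
        CREATE INDEX vote_user_idx ON vote (user_id);
    """)

    def plan(sql):
        # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        return " | ".join(row[3] for row in rows)

    by_object = plan("SELECT * FROM vote WHERE object_id = 1")
    by_user = plan("SELECT * FROM vote WHERE user_id = 42")
    conn.close()
    return by_object, by_user


if __name__ == "__main__":
    for detail in vote_query_plans():
        print(detail)
```

The lookup by object alone should be reported as an index search on the primary key's index (its leading column), and the lookup by user should name vote_user_idx.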
From a database point of view, your problem is that you are using inadequate middleware if it does not support composite primary keys.
You'll probably have to introduce an artificial primary key and, in addition, have a unique constraint on (object, user) and an index on user, but your gut feeling that this is not the best solution from a database perspective is absolutely right.
I googled a lot, but I did not find a straightforward answer with an example.
An example would be very helpful.
The primary key is a unique key in your table that you choose that best uniquely identifies a record in the table. All tables should have a primary key, because if you ever need to update or delete a record you need to know how to uniquely identify it.
A surrogate key is an artificially generated key. They're useful when your records essentially have no natural key (such as a Person table, since it's possible for two people born on the same date to have the same name, or records in a log, since it's possible for two events to happen such that they carry the same timestamp). Most often you'll see these implemented as integers in an automatically incrementing field, or as GUIDs that are generated automatically for each record. ID numbers are almost always surrogate keys.
Unlike primary keys, not all tables need surrogate keys, however. If you have a table that lists the states in America, you don't really need an ID number for them. You could use the state abbreviation as a primary key code.
The main advantage of the surrogate key is that they're easy to guarantee as unique. The main disadvantage is that they don't have any meaning. There's no meaning that "28" is Wisconsin, for example, but when you see 'WI' in the State column of your Address table, you know what state you're talking about without needing to look up which state is which in your State table.
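A small sketch of the states example, using Python's built-in sqlite3 module as a stand-in database. The state table uses its abbreviation as a natural primary key, while address gets a surrogate id. The table layout and the Madison row are invented for illustration.

```python
import sqlite3


def address_rows():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
    conn.executescript("""
        CREATE TABLE state (
            abbr TEXT PRIMARY KEY,   -- natural key: the abbreviation itself
            name TEXT NOT NULL
        );
        CREATE TABLE address (
            id    INTEGER PRIMARY KEY,  -- surrogate key: no meaning of its own
            city  TEXT NOT NULL,
            state TEXT NOT NULL REFERENCES state(abbr)
        );
        INSERT INTO state VALUES ('WI', 'Wisconsin');
        INSERT INTO address (city, state) VALUES ('Madison', 'WI');
    """)
    # No join needed: the foreign key value is already readable.
    rows = conn.execute("SELECT id, city, state FROM address").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(address_rows())
```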
A surrogate key is a made up value with the sole purpose of uniquely identifying a row. Usually, this is represented by an auto incrementing ID.
Example code:
CREATE TABLE Example
(
    SurrogateKey INT IDENTITY(1,1) -- A surrogate key that increments automatically
)
A primary key is the identifying column or set of columns of a table. It can be a surrogate key or any other unique combination of columns (for example a compound key). It MUST be unique for every row and cannot be NULL.
Example code:
CREATE TABLE Example
(
    PrimaryKey INT PRIMARY KEY -- A primary key is just a unique identifier
)
All keys are identifiers used as surrogates for the things they identify. E.F.Codd explained the concept of system-assigned surrogates as follows [1]:
Database users may cause the system to generate or delete a surrogate,
but they have no control over its value, nor is its value ever
displayed to them.
This is what is commonly referred to as a surrogate key. The definition is immediately problematic, however, because Codd was assuming that such a feature would be provided by the DBMS. DBMSs in general have no such feature. The keys are normally visible to at least some DBMS users as, for obvious reasons, they have to be. The concept of a surrogate has therefore morphed slightly in usage. The term is generally used in the data management profession to mean a key that is not exposed and used as an identifier in the business domain. Note that this is essentially unrelated to how the key is generated or how "artificial" it is perceived to be. All keys consist of symbols invented by humans or machines. The only possible significance of the term surrogate therefore relates to how the key is used, not how it is created or what its values are.
[1] Extending the database relational model to capture more meaning, E.F.Codd, 1979
This is a great treatment describing the various kinds of keys:
http://www.agiledata.org/essays/keys.html
A surrogate key is typically a numeric value. Within SQL Server, Microsoft allows you to define a column with an identity property to help generate surrogate key values.
The PRIMARY KEY constraint uniquely identifies each record in a database table.
Primary keys must contain UNIQUE values.
A primary key column cannot contain NULL values.
Most tables should have a primary key, and each table can have only ONE primary key.
http://www.databasejournal.com/features/mssql/article.php/3922066/SQL-Server-Natural-Key-Verses-Surrogate-Key.htm
I think Michelle Poolet describes it in a very clear way:
A surrogate key is an artificially produced value, most often a
system-managed, incrementing counter whose values can range from 1 to
n, where n represents a table's maximum number of rows. In SQL Server,
you create a surrogate key by assigning an identity property to a
column that has a number data type.
http://sqlmag.com/business-intelligence/surrogate-key-vs-natural-key
A surrogate key usually helps when you want to replace a composite key with a single identity column.
I am using GUIDs as the primary key for all my other tables, but I have a requirement that needs an incrementing number. I tried to create a field in the table with auto-increment, but MySQL complained that it needed to be the primary key.
My application uses MySQL 5, with NHibernate as the ORM.
Possible solutions I have thought of are:
change the primary key to the auto-increment field but still have the Id as a GUID so the rest of my app is consistent.
create a composite key with both the GUID and the auto-increment field.
My thoughts at the moment are leaning towards the composite key idea.
EDIT: The Row ID (Primary Key) is the GUID currently. I would like to add an INT field that is auto-incremented so that it is human readable. I just didn't want to move away from the current standard in the app of having GUIDs as primary keys.
A GUID value is intended to be unique across tables and even databases, so make the auto_increment column the primary key and create a UNIQUE index on the GUID.
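Roughly, that layout looks like the sketch below. It uses Python's built-in sqlite3 and uuid modules instead of MySQL and NHibernate so that it is self-contained; the item table, its columns and the helper names are hypothetical.

```python
import sqlite3
import uuid


def make_item_table():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE item (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- human-readable running number
            guid TEXT NOT NULL UNIQUE,               -- identifier the rest of the app uses
            name TEXT NOT NULL
        )
    """)
    return conn


def add_item(conn, name, guid=None):
    # Generate the GUID in the application unless the caller supplies one.
    guid = guid or str(uuid.uuid4())
    cur = conn.execute("INSERT INTO item (guid, name) VALUES (?, ?)", (guid, name))
    return cur.lastrowid, guid


if __name__ == "__main__":
    conn = make_item_table()
    print(add_item(conn, "first"))
    print(add_item(conn, "second"))
```

The UNIQUE index still stops the same GUID from appearing twice, which a composite (GUID, number) key would not.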
I would lean the other way.
Why? Because creating a composite key gives the impression to the next guy who comes along that it's OK to have the same GUID in the table twice but with different sequence numbers.
A couple of thoughts:
If your GUID is autoincremental and unique, why not let it be the actual Primary Key?
On the other hand, you should never take semantic decisions based on programmatic problems: you have a problem with MySQL, not with the design of your DB.
So, a couple of workarounds here:
Creating a trigger that would set the GUID to the proper value once it's inserted. That's a MySQL solution to a MySQL problem, without altering semantics for your schema.
Before inserting, start a transaction (make sure auto commit is set to false), find out the latest GUID, increment and insert with the new value. In other words, auto-increment not automatically :P
GUIDs are not intended to be orderable, which is why AUTO_INCREMENT does not make sense for them.
You may, though, use AUTO_INCREMENT for the second column of a composite primary key in MyISAM tables. You can create a composite key over the (GUID, INT) columns and make the second column AUTO_INCREMENT.
To generate a new GUID, just call UUID() in an INSERT statement or in a trigger.
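No. In MySQL an AUTO_INCREMENT column must be defined as a key (that is, indexed), and a table can have only one such column. It does not strictly have to be the primary key, but making it the primary key is the simplest way to satisfy the rule.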
If, for some reason, you can't change the identity column to be a primary key, what about manually generating the auto-increment via some kind of SEQUENCE table plus a trigger that queries the SEQUENCE table and saves the next value to use? Then assign the value to the destination table in the trigger. Same effect. The only question I would have is whether the auto-incremented value is going to make it back through NHibernate without a re-select of the table.
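Here is a minimal sketch of the sequence-table-plus-trigger idea. It is written against SQLite through Python's built-in sqlite3 module, so the trigger syntax is SQLite's and would need adapting for MySQL. The seq and item tables, the display_no column and the trigger name are all made up.

```python
import sqlite3
import uuid


def insert_items(labels):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE seq (name TEXT PRIMARY KEY, next_val INTEGER NOT NULL);
        INSERT INTO seq VALUES ('item', 1);

        CREATE TABLE item (
            guid       TEXT PRIMARY KEY,   -- the GUID stays the primary key
            label      TEXT NOT NULL,
            display_no INTEGER UNIQUE      -- filled in by the trigger
        );

        CREATE TRIGGER item_assign_no AFTER INSERT ON item
        BEGIN
            -- Copy the next sequence value onto the new row, then advance the sequence.
            UPDATE item
               SET display_no = (SELECT next_val FROM seq WHERE name = 'item')
             WHERE guid = NEW.guid;
            UPDATE seq SET next_val = next_val + 1 WHERE name = 'item';
        END;
    """)
    for label in labels:
        conn.execute(
            "INSERT INTO item (guid, label) VALUES (?, ?)", (str(uuid.uuid4()), label)
        )
    # The application only sees the assigned number by selecting the row again.
    rows = conn.execute(
        "SELECT label, display_no FROM item ORDER BY display_no"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(insert_items(["a", "b", "c"]))
```

Note that the sketch has to re-select the rows to read display_no, which is exactly the round trip the question about NHibernate is worried about.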