Oracle IDENTITY column versus PRIMARY KEY - sql

I'm having trouble finding anything definitive about whether it's necessary to declare an IDENTITY column as PRIMARY KEY in Oracle 12.2c. Does an IDENTITY column automatically create an index, like a PK? Is it just being redundant? I do believe you can have an IDENTITY column and a separate PK, though we are not doing that.
ID NUMBER AS IDENTITY PRIMARY KEY == ID NUMBER AS IDENTITY ?

Does an IDENTITY column automatically create an index, like a PK?
No. An identity column is just a column auto-populated with a sequentially generated number. You can use it however you want, but the typical use is as a synthetic primary key.
Is it just being redundant?
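No. Without a PRIMARY KEY (or UNIQUE) constraint, nothing enforces uniqueness on the identity column and no index is created for it.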
I do believe you can have an IDENTITY column and separate PK
Yes, you can.
though we are not doing that.
Fine, if you mean you are not having a separate PK column in addition to the identity column. Defining a PK constraint over the identity column would be a good idea.

It's a common mistake to mix logical and physical organization of data.
You successfully mixed 3 orthogonal concepts:
logical: PRIMARY KEY constraint
physical: INDEX
automatic value generation: IDENTITY column
Does an IDENTITY column automatically create an index, like a PK? Is it just being redundant?
Those questions are very version dependent. IDENTITY itself was only introduced in Oracle 12c (12.1).
I do believe you can have an IDENTITY column and separate PK, though we are not doing that.
You are correct here.
Auto value generation, logical constraint and physical data organization are orthogonal to each other.

An IDENTITY column can be and often is useful as primary key, but it doesn't have to be.
The identity column is very useful for the surrogate primary key column. When you insert a new row into the table, Oracle auto-generates a sequential value and inserts it into the identity column.
https://www.oracletutorial.com/oracle-basics/oracle-identity-column/

Related

What is difference between primary key and identity?

What is its use, when both identify a unique row?
Why do people use an identity column as a primary key?
Can anyone briefly describe the answer?
A primary key is a logical concept: it is the means by which you uniquely identify each record in a table. There are several types of primary key. A natural key uses a data attribute from the business domain that is guaranteed to meet the requirements for a primary key (unique, not null, immutable), such as a social security number. A compound key is made up of multiple columns (often used in "parent-child" relationships). A surrogate key is created by the system; it could be an auto-increment or identity column.
Identity is a column property rather than a data type in its own right, although it is often loosely described as one. It is very useful as a surrogate primary key, because it has all the attributes required. It's unlikely you'd use an identity column for purposes other than as a primary key, but there's nothing to stop you from doing so.
So, not all primary keys use identity, and not all identity columns are primary keys.
Primary key is a kind of unique key. It's, in fact, a restriction (constraint) that values for a specific column (or, in general case, set of columns) cannot be the same in different rows (even when manually/explicitly setting to same values with an insert/update).
A primary/unique key isn't required to be auto-incremented. In fact, it isn't required to be an integer at all; it can be text or another type.
A primary key is a bit stricter than a usual unique key in that it usually implies NOT NULL and, in addition to that, only one primary key is allowed per table (while several unique keys per table are allowed in addition to the primary key).
Creating a primary/unique key usually implicitly creates an index to make searching and constraint-checking by those column(s) faster.
E.g. if column my_column of my_table is marked as primary or unique key, you can't do this:
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …);
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …); -- the same value for my_column again
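A minimal sketch of that rejection, using Python's built-in sqlite3 module as a stand-in for whichever RDBMS you use. The table and column names follow the example above; the concrete values (10, 'a', 'b', and so on) are made up for illustration, and the exact error type differs between database drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    " my_column INTEGER PRIMARY KEY,"
    " other_column TEXT,"
    " third_column TEXT)"
)

# The first insert succeeds.
conn.execute("INSERT INTO my_table VALUES (10, 'a', 'b')")

# The second insert reuses 10 for my_column, so the key constraint rejects it.
try:
    conn.execute("INSERT INTO my_table VALUES (10, 'c', 'd')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(duplicate_rejected, row_count)
```

The same would happen with a UNIQUE constraint instead of PRIMARY KEY; the constraint, not the way the value was produced, is what enforces uniqueness.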
Identity in your RDBMS is what other RDBMSes may call auto_increment or serial. It's just a feature whereby, during a row-insert operation, a specific column that is not explicitly set to some value is automatically initialized to (most often) consecutive integer values.
E.g. if column my_column of my_table is marked as auto_increment/serial/identity, you can do this:
INSERT INTO my_table (other_column, third_column) VALUES (…, …);
-- not specifying any value for my_column manually,
-- it'll be initialized automatically to some value
-- (usually an increasing integer sequence)
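Here is a small sketch of the same idea, again with Python's sqlite3 as a stand-in. One caveat that is specific to this stand-in: SQLite only allows AUTOINCREMENT on an INTEGER PRIMARY KEY column, so the example cannot separate "auto-generated" from "primary key" the way Oracle or SQL Server can. The inserted values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    " my_column INTEGER PRIMARY KEY AUTOINCREMENT,"
    " other_column TEXT,"
    " third_column TEXT)"
)

# my_column is never mentioned in the INSERT; the database fills it in.
for other, third in [("a", "b"), ("c", "d"), ("e", "f")]:
    conn.execute(
        "INSERT INTO my_table (other_column, third_column) VALUES (?, ?)",
        (other, third),
    )

generated = [
    row[0]
    for row in conn.execute("SELECT my_column FROM my_table ORDER BY my_column")
]
print(generated)
```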
Auto_increment/serial/identity usually doesn't guarantee strictly consecutive automatic values (especially in the case of aborted transactions).
Specifically, the Transact-SQL documentation says that identity doesn't guarantee:
uniqueness (use unique/primary keys to enforce that);
strictly consecutive values.
Update: As a_horse_with_no_name suggested, "identity" appears to be not only a name of the common auto_increment/serial/identity feature within specific RDBMSes (e.g. Microsoft SQL Server), but also a name defined by ANSI SQL standard.
AFAIK, it doesn't differ very much from what I described above (about the common auto_increment/serial/identity feature in implementations). I mean that it makes column values be automatically initialized from an integer sequence, but doesn't guarantee uniqueness or strictly consecutive values.
Still, I suppose that, unlike auto_increment/serial columns in MySQL/PostgreSQL, an ANSI-SQL-standard GENERATED ALWAYS AS IDENTITY column doesn't allow its values to be set manually in INSERT or UPDATE (only automatically).
In a database table, every row should be unique, and you need to be able to identify a particular row uniquely.
A given table may have one or more columns which have unique values, so any of these columns can do the job. There may also be two or more columns which, while not unique of themselves, form a unique combination. That will also do.
Any column, or combination of columns, which can uniquely identify a row is called a candidate key. In principle, you can choose any key that you like, but you need to ensure that uniqueness is enduring. For example, in a small table of persons, the given name may be unique, but you run the risk of blowing that with the next additional person.
A primary key is the candidate key you nominate as your preferred key. For example, a table of persons may have a number of unique attributes such as email address, mobile phone number and others. The primary key is the attribute you choose in preference to the others.
The following is not strictly required, but is good practice for a good Primary Key:
A Primary Key shouldn’t change
A Primary Key shouldn’t be recycled
For this reason, a Primary Key shouldn’t have any real meaning, so there should never be a reason to change or reuse it. As a result, the primary key is often an arbitrary code whose only real meaning is that it identifies the row. If the key is purely used for identification and has no other meaning, it is often referred to as a Surrogate Key.
You can put some effort into generating arbitrary codes. Sometimes they follow complex patterns which can be used to check their validity.
If you want to take a lazier approach, you can use a sequence number. Contrary to my previous advice, though, it does sort of have a meaning: it is strictly sequential, so you can learn which row was added after another, but not exactly when. However, that fact won’t change — it will not change in value, and will not be reused — so it’s still pretty stable.
An identity column is, in fact, a sequence number. It is auto-generated, and very useful if you want an arbitrary code for your primary key. Unfortunately, it came relatively late to the very slow-moving standards, so every DBMS had its own non-standard variation:
MySQL calls it AUTO_INCREMENT
SQLite calls it AUTOINCREMENT
MSSQL calls it IDENTITY
MS Access calls it AutoNumber
PostgreSQL calls it SERIAL
and each has its own quirks.
More recently (2003, I believe), it has been added to the standards in the form of:
int generated by default as identity
but this has only just started to appear in PostgreSQL and Oracle. This use of IDENTITY behaves differently from Microsoft’s.
A primary key requires a unique value and cannot be null. An identity column can supply such a value, so we don't need to provide it manually for each new record added to the table.
Note that in SQL Server the identity value is still incremented when a record fails to insert for some reason.

What does PRIMARY KEY actually signify, and does my table need one?

I have a PostgreSQL 9.3 database with a users table that stores usernames in their case-preserved format. All queries will be case insensitive, so I should have an index that supports that. Additionally, usernames must be unique, regardless of case.
This is what I have come up with:
forum=> \d users
          Table "public.users"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 name   | character varying(24) | not null
Indexes:
    "users_lower_idx" UNIQUE, btree (lower(name::text))
Expressed in standard SQL syntax:
CREATE TABLE users (
name varchar(24) NOT NULL
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
With this schema, I've satisfied all my constraints, albeit without a primary key. The SQL standard doesn't support functional primary keys, so I cannot promote the index:
forum=> ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_idx;
ERROR: index "users_lower_idx" contains expressions
LINE 1: ALTER TABLE users ADD PRIMARY KEY USING INDEX users_lower_id...
^
DETAIL: Cannot create a primary key or unique constraint using such an index.
But, I already have the UNIQUE constraint, and the column is already marked "NOT NULL." If I had to have a primary key, I could construct the table like this:
CREATE TABLE users (
name varchar(24) PRIMARY KEY
);
CREATE UNIQUE INDEX "users_lower_idx" ON users (lower(name));
But then I'll have two indexes, and that seems wasteful and unnecessary to me. So, does PRIMARY KEY mean anything special to postgres beyond "UNIQUE NOT NULL," and am I missing anything by not having one?
First off, practically every table should have a primary key.
citext
This additional module provides a data type of the same name ("ci" stands for case-insensitive). Per documentation:
The citext module provides a case-insensitive character string type, citext. Essentially, it internally calls lower when comparing values. Otherwise, it behaves almost exactly like text.
It is intended for exactly the purpose you describe:
The citext data type allows you to eliminate calls to lower in SQL queries, and allows a primary key to be case-insensitive.
Bold emphasis mine.
Be sure to read the manual about limitations first. Install it once per database with
CREATE EXTENSION citext;
text
If you don't want to go that route, I suggest you add a serial as surrogate primary key.
CREATE TABLE users (
user_id serial PRIMARY KEY
, username text NOT NULL
);
I would use text instead of varchar(24). Use a CHECK constraint if you need to enforce a maximum length (that may change at a later time). Details:
Any downsides of using data type "text" for storing strings?
Change PostgreSQL columns used in views
Along with the UNIQUE index in your original design (without type cast):
CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username));
The underlying integer of a serial is small and fast and does not have to waste time with lower() or the collation of your database. That's particularly useful for foreign key references. I mostly prefer that over some natural primary key with varying properties.
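As a rough illustration of the surrogate-key-plus-functional-index design, here is a sketch that uses Python's sqlite3 rather than PostgreSQL (SQLite also supports unique indexes on expressions). An INTEGER PRIMARY KEY column plays the role of serial here, and try_register is a hypothetical helper introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " user_id INTEGER PRIMARY KEY,"  # stands in for PostgreSQL's serial
    " username TEXT NOT NULL)"
)
# Uniqueness is enforced on the lower-cased name, not on the stored spelling.
conn.execute(
    "CREATE UNIQUE INDEX users_username_lower_idx ON users (lower(username))"
)

def try_register(name):
    """Return the new user_id, or None if the name is taken (ignoring case)."""
    try:
        cur = conn.execute("INSERT INTO users (username) VALUES (?)", (name,))
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None

first = try_register("Alice")
clash = try_register("ALICE")  # differs from 'Alice' only by case
second = try_register("Bob")
stored = conn.execute(
    "SELECT username FROM users WHERE user_id = ?", (first,)
).fetchone()[0]
print(first, clash, second, stored)
```

The stored spelling stays case-preserved, while other tables can reference the small integer key instead of the name.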
Both solutions have pros and cons.
I would suggest using a primary key, as you have stated you want something that is unique, and as you have demonstrated that you can put unique constraints on a username. I will assume that, since this is a unique, not null username, you will use it to track your users in other parts of the database, and that you will allow usernames to be changed.
This is where a primary key comes in handy: instead of having to go into all of your tables and change the value of the Username column, you will only have one place to change it.
Example
Without primary key:
Table users
Username
'Test'
Table thingsdonebyUsers
RandomColumn   AnotherColumn   Username
RandomValue    RandomValue     Test
Now assume your user wants to change his username to Test1. You now have to find every place you used Username and change it to the new value before you change it in your users table, since I'm assuming you will have a constraint there.
With Primary Key
Table users
PK   Username
1    'Test'
Table thingsdonebyUsers
RandomColumn   AnotherColumn   PK_Users
RandomValue    RandomValue     1
Now you can just change your users table and be done with the change.
You can still enforce unique and not null on your username column as you demonstrated.
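The "With Primary Key" layout can be sketched as follows, using Python's sqlite3 as a stand-in database. The snake_case table and column names are adaptations of the ones in the example above, and the second column (AnotherColumn) is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute(
    "CREATE TABLE users ("
    " pk INTEGER PRIMARY KEY,"
    " username TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE things_done_by_users ("
    " random_column TEXT,"
    " pk_users INTEGER NOT NULL REFERENCES users (pk))"
)
conn.execute("INSERT INTO users (pk, username) VALUES (1, 'Test')")
conn.execute("INSERT INTO things_done_by_users VALUES ('RandomValue', 1)")

# Renaming the user touches exactly one row in one table.
conn.execute("UPDATE users SET username = 'Test1' WHERE pk = 1")

joined = conn.execute(
    "SELECT u.username, t.random_column"
    " FROM things_done_by_users t JOIN users u ON u.pk = t.pk_users"
).fetchall()
print(joined)
```

The referencing table never stored the name, so it picks up the new one through the join without being updated.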
This is just one of the many advantages of giving your tables a Primary Key that is an unrelated value (a surrogate key).
As for what a PK actually signifies, it is just a non-nullable unique column that identifies the row, so in this sense you already have a Primary Key on your table. The thing is that PKs are usually INT numbers, for the reason I explained above.
Short answer: No, you don't need a declarative "PRIMARY KEY", since the UNIQUE index serves the same exact purpose.
Long answer:
The idea of having Primary Keys comes from database systems where the data is physically in key order. This requires having a single, "primary" key. MySQL InnoDB is this way, as are many older databases.
However, PostgreSQL does not keep the tables in key order; it separates the indexes, including the primary key index, from the heap, which is essentially unordered. As a result, in Postgres, there is no material difference between primary keys and unique indexes. You can even create a foreign key against a unique index, as long as that index covers the whole table (that is, it is not a partial index).
That being said, some tools external to PostgreSQL look for primary keys and do not regard unique indexes as being equivalent. These tools may cause you issues because of not finding a PK.

Setting a nvarchar as a primary key for datatable relationships

I have a data table in the database with the column Email set as nvarchar(100) (because I couldn't set it as a primary key when it was nvarchar(MAX)).
So it is the primary key now, but I can't change its Identity Specification to Yes, so I can't make a relationship between this table and another table when this is the primary key.
How can I make a relationship when this is the primary key?
How can I set the Identity Specification to Yes? Or is there another way without doing it?
Thanks in advance
The concept of identity applies to an integer column. The database will automatically assign an increasing number to each new row. An identity column is typically a primary key.
So it makes no sense for a varchar column to be an identity column. SSMS is right in graying the identity section out.
A foreign key can refer to any data type, including a varchar(100). The column it refers to must be covered by a primary key or unique constraint, which is backed by an index. A column that is a primary key always has an index on it.
The foreign key column and the column it references must have the same data type. Perhaps you could post the definition of the two tables you are trying to link.
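To illustrate that a relationship needs a key on the referenced column, not an identity, here is a sketch with Python's sqlite3 standing in for SQL Server. The accounts and orders tables and the example addresses are hypothetical, since the question does not show its table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# The parent table's primary key is a string column; no identity involved.
conn.execute("CREATE TABLE accounts (email VARCHAR(100) PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " email VARCHAR(100) NOT NULL REFERENCES accounts (email))"
)

conn.execute("INSERT INTO accounts VALUES ('a@example.com')")
conn.execute("INSERT INTO orders (email) VALUES ('a@example.com')")  # parent exists

# A child row pointing at a missing parent violates the foreign key.
try:
    conn.execute("INSERT INTO orders (email) VALUES ('nobody@example.com')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```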

Benefits of using Identity Column instead of Unique Key

I want to know the advantages of using the identity column over a Unique Key. I have always used the identity column, and know it is better than a Unique Key. But I want to know the benefits of an identity key over a unique key.
Thanks
Are you trying to decide between using an identity column vs. a uniqueidentifier column as a primary key?
Uniqueidentifiers (GUIDs) have the advantage of being unique across the entire environment, making it easier to uniquely identify a record outside of the context of a given table. They also provide a key you can expose in your applications that is not easily enumerated.
Identity columns use less storage and are therefore generally faster and more efficient.
There is no right answer for their use. It really depends on the context.
A foreign key in another table can reference a primary key; it cannot reference an identity column unless that column also has a primary key or unique constraint. Without seeing the type of data you are trying to store, I would recommend using an identity column but also setting that as your primary key.

Can you use auto-increment in MySQL without it being the primary key

I am using GUIDs as my primary key for all my other tables, but I have a requirement that needs an incrementing number. I tried to create a field in the table with auto-increment, but MySQL complained that it needed to be the primary key.
My application uses MySQL 5, with NHibernate as the ORM.
Possible solutions I have thought of are:
change the primary key to the auto-increment field but still have the Id as a GUID so the rest of my app is consistent.
create a composite key with both the GUID and the auto-increment field.
My thoughts at the moment are leaning towards the composite key idea.
EDIT: The Row ID (Primary Key) is the GUID currently. I would like to add an INT field that is auto-incremented so that it is human readable. I just didn't want to move away from the current standard in the app of having GUIDs as primary keys.
A GUID value is intended to be unique across tables and even databases, so make the auto_increment column the primary key and put a UNIQUE index on the GUID.
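A rough sketch of that layout, with Python's sqlite3 standing in for MySQL (the MySQL DDL would use AUTO_INCREMENT and a suitable GUID column type instead). The items table, its column names, and the add_item helper are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # human-readable number
    " guid TEXT NOT NULL UNIQUE)"              # identifier the app keeps using
)

def add_item():
    """Insert a row with a fresh GUID and return (seq, guid)."""
    guid = str(uuid.uuid4())
    cur = conn.execute("INSERT INTO items (guid) VALUES (?)", (guid,))
    return cur.lastrowid, guid

seq1, guid1 = add_item()
seq2, guid2 = add_item()

# The UNIQUE index still stops the same GUID from appearing twice.
try:
    conn.execute("INSERT INTO items (guid) VALUES (?)", (guid1,))
    reuse_rejected = False
except sqlite3.IntegrityError:
    reuse_rejected = True

print(seq1, seq2, reuse_rejected)
```

The application can keep looking rows up by GUID, while the integer column provides the incrementing, human-readable number.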
I would lean the other way.
Why? Because creating a composite key gives the impression to the next guy who comes along that it's OK to have the same GUID in the table twice but with different sequence numbers.
A couple of thoughts:
If your GUID is auto-incremental and unique, why not let it be the actual Primary Key?
On the other hand, you should never make semantic decisions based on programmatic problems: you have a problem with MySQL, not with the design of your DB.
So, a couple of workarounds here:
Creating a trigger that would set the GUID to the proper value once it's inserted. That's a MySQL solution to a MySQL problem, without altering semantics for your schema.
Before inserting, start a transaction (make sure auto commit is set to false), find out the latest GUID, increment and insert with the new value. In other words, auto-increment not automatically :P
GUIDs are not intended to be orderable; that's why AUTO_INCREMENT does not make sense for them.
You may, though, use AUTO_INCREMENT for the second column of a composite primary key in MyISAM tables. You can create a composite key over (GUID, INT) columns and make the second column AUTO_INCREMENT.
To generate a new GUID, just call UUID() in an INSERT statement or in a trigger.
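No, not on an arbitrary column: MySQL requires the auto_increment column to be defined as a key (indexed), and allows only one such column per table. That key is usually the primary key, but a UNIQUE key also satisfies the requirement.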
If, for some reason, you can't change the identity column to be a primary key, what about manually generating the auto-increment via some kind of SEQUENCE table plus a trigger that queries the SEQUENCE table for the next value to use, then assigns that value to the destination table? Same effect. The only question I would have is whether the auto-incremented value is going to make it back through NHibernate without a re-select of the table.