I have an existing table that currently doesn't have an id column and a lot of duplicate rows on what should be a unique pair - it's messy. Example:
fips | customer_id
-------+------------
17043 | 2085
17043 | 2085
42091 | 4426
42091 | 4426
customer_id/fips should be unique, but the current code and schema don't enforce that. There also isn't an id column, so I have no unique way to reference a single row.
I'd like to add an id column and assign sequential integers so I can have a unique primary key. How can I go about that?
Postgres 10 added IDENTITY columns (as demonstrated in Gordon's answer).
In Postgres 9.6 (or any version) you can use a serial column instead.
Either way, make it the PRIMARY KEY in the same command. That's cheaper for big tables:
ALTER TABLE tbl ADD COLUMN tbl_id serial PRIMARY KEY;
Or:
ALTER TABLE tbl ADD COLUMN tbl_id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
IDENTITY columns are not PRIMARY KEY automatically. Postgres allows multiple IDENTITY columns for the same table (even if that's rarely useful).
See:
Auto increment table column
Or you clean up the mess to make (fips, customer_id) unique. Then that can be your PK. See:
How to delete duplicate rows without unique identifier
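A minimal sketch of that cleanup, assuming the table is named tbl as in the commands above and that neither column contains NULLs (a primary key requires NOT NULL). It uses the system column ctid to tell otherwise identical rows apart:

```sql
-- Assumption: table is called tbl; fips and customer_id contain no NULLs.
BEGIN;

-- For each (fips, customer_id) pair, keep the row with the smallest ctid
-- and delete every other physical copy.
DELETE FROM tbl a
USING  tbl b
WHERE  a.fips = b.fips
AND    a.customer_id = b.customer_id
AND    a.ctid > b.ctid;

-- With duplicates gone, the pair can serve as the primary key.
ALTER TABLE tbl ADD PRIMARY KEY (fips, customer_id);

COMMIT;
```

Running both statements in one transaction means the table is never visible in a deduplicated-but-unconstrained state. On a large table the self-join can be slow, so test it on a copy first.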
You can simply add an identity column:
alter table t add column id int generated always as identity;
This is my table. I want to add a primary key column named "emp_id" as the first column, but I don't know how to do it. Can you please help me?
EMP_NAME EMP_POS SALARY GENDER
----------------- ----------------- -------------- ------
anand worker 10000 M
balu manager 50000 M
carl manager 50000 M
riya md 60000 F
prabhu owner 99999999 M
The old way of doing this is a multi-step process:
add the column which will be the primary key
update the column
enforce the primary key.
Something like this:
create sequence t23_id;
alter table t23 add id number;
update t23
set id = t23_id.nextval
;
alter table t23 add constraint t23_pk primary key (id);
In 12c Oracle added Identity columns (like SQL Server auto-incrementing columns). This reduces the number of steps to two:
alter table t23i add id number GENERATED ALWAYS AS IDENTITY;
alter table t23i add constraint t23i_pk primary key (id);
Unfortunately it can't be done in one step. This ...
alter table t23i add id number GENERATED ALWAYS AS IDENTITY primary key;
...hurls ...
ORA-01758: table must be empty to add mandatory (NOT NULL) column
Introduce identity column:
see http://sql-plsql.blogspot.sg/2014/11/add-identity-column-to-table.html
Reorder the columns:
see http://www.dba-oracle.com/t_change_column_order_within_oracle_table.htm
Hope these references help.
As far as I understand the documentation, the following definitions are equivalent:
create table foo (
id serial primary key,
code integer,
label text,
constraint foo_uq unique (code, label));
create table foo (
id serial primary key,
code integer,
label text);
create unique index foo_idx on foo using btree (code, label);
However, a note in the manual for Postgres 9.4 says:
The preferred way to add a unique constraint to a table is ALTER TABLE ... ADD CONSTRAINT. The use of indexes to enforce unique constraints
could be considered an implementation detail that should not be
accessed directly.
(Edit: this note was removed from the manual with Postgres 9.5.)
Is it only a matter of good style? What are the practical consequences of choosing one of these variants (e.g. in performance)?
I had some doubts about this basic but important issue, so I decided to learn by example.
Let's create a test table master with two columns: con_id with a unique constraint and ind_id indexed by a unique index.
create table master (
con_id integer unique,
ind_id integer
);
create unique index master_unique_idx on master (ind_id);
Table "public.master"
Column | Type | Modifiers
--------+---------+-----------
con_id | integer |
ind_id | integer |
Indexes:
"master_con_id_key" UNIQUE CONSTRAINT, btree (con_id)
"master_unique_idx" UNIQUE, btree (ind_id)
In the table description (\d in psql) you can tell a unique constraint from a unique index.
Uniqueness
Let's check uniqueness, just in case.
test=# insert into master values (0, 0);
INSERT 0 1
test=# insert into master values (0, 1);
ERROR: duplicate key value violates unique constraint "master_con_id_key"
DETAIL: Key (con_id)=(0) already exists.
test=# insert into master values (1, 0);
ERROR: duplicate key value violates unique constraint "master_unique_idx"
DETAIL: Key (ind_id)=(0) already exists.
test=#
It works as expected!
Foreign keys
Now we'll define a detail table with two foreign keys referencing our two columns in master.
create table detail (
con_id integer,
ind_id integer,
constraint detail_fk1 foreign key (con_id) references master(con_id),
constraint detail_fk2 foreign key (ind_id) references master(ind_id)
);
Table "public.detail"
Column | Type | Modifiers
--------+---------+-----------
con_id | integer |
ind_id | integer |
Foreign-key constraints:
"detail_fk1" FOREIGN KEY (con_id) REFERENCES master(con_id)
"detail_fk2" FOREIGN KEY (ind_id) REFERENCES master(ind_id)
Well, no errors. Let's make sure it works.
test=# insert into detail values (0, 0);
INSERT 0 1
test=# insert into detail values (1, 0);
ERROR: insert or update on table "detail" violates foreign key constraint "detail_fk1"
DETAIL: Key (con_id)=(1) is not present in table "master".
test=# insert into detail values (0, 1);
ERROR: insert or update on table "detail" violates foreign key constraint "detail_fk2"
DETAIL: Key (ind_id)=(1) is not present in table "master".
test=#
Both columns can be referenced in foreign keys.
Constraint using index
You can add a table constraint using an existing unique index.
alter table master add constraint master_ind_id_key unique using index master_unique_idx;
Table "public.master"
Column | Type | Modifiers
--------+---------+-----------
con_id | integer |
ind_id | integer |
Indexes:
"master_con_id_key" UNIQUE CONSTRAINT, btree (con_id)
"master_ind_id_key" UNIQUE CONSTRAINT, btree (ind_id)
Referenced by:
TABLE "detail" CONSTRAINT "detail_fk1" FOREIGN KEY (con_id) REFERENCES master(con_id)
TABLE "detail" CONSTRAINT "detail_fk2" FOREIGN KEY (ind_id) REFERENCES master(ind_id)
Now there is no difference between the descriptions of the two column constraints.
Partial indexes
In table constraint declaration you cannot create partial indexes.
It comes directly from the definition of create table ....
In unique index declaration you can set WHERE clause to create partial index.
You can also create index on expression (not only on column) and define some other parameters (collation, sort order, NULLs placement).
You cannot add table constraint using partial index.
alter table master add column part_id integer;
create unique index master_partial_idx on master (part_id) where part_id is not null;
alter table master add constraint master_part_id_key unique using index master_partial_idx;
ERROR: "master_partial_idx" is a partial index
LINE 1: alter table master add constraint master_part_id_key unique ...
^
DETAIL: Cannot create a primary key or unique constraint using such an index.
One more advantage of using UNIQUE INDEX vs. UNIQUE CONSTRAINT is that you can easily DROP/CREATE an index CONCURRENTLY, whereas with a constraint you can't.
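As a rough illustration of that point, here is a sketch of rebuilding a standalone unique index without blocking writes. The table items and the index names are hypothetical, and the CONCURRENTLY commands must be run outside a transaction block:

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE items (sku integer);
CREATE UNIQUE INDEX CONCURRENTLY items_sku_idx ON items (sku);

-- Later: rebuild the index (e.g. because of bloat) while writes continue.
CREATE UNIQUE INDEX CONCURRENTLY items_sku_idx_new ON items (sku);
DROP INDEX CONCURRENTLY items_sku_idx;
ALTER INDEX items_sku_idx_new RENAME TO items_sku_idx;
```

Uniqueness stays enforced throughout, because the new index is in place before the old one is dropped. An index that backs a UNIQUE or PRIMARY KEY constraint cannot be dropped this way.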
Uniqueness is a constraint. It happens to be implemented via the creation
of a unique index since an index is quickly able to search all existing
values in order to determine if a given value already exists.
Conceptually the index is an implementation detail and uniqueness should be
associated only with constraints.
So performance should be the same.
Since various people have provided advantages of unique indexes over unique constraints, here's a drawback: a unique constraint can be deferred (only checked at the end of the transaction), a unique index can not be.
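A small sketch of what deferral allows, using a hypothetical table slots. Two rows swap their pos values inside one transaction, which briefly creates a duplicate:

```sql
-- Hypothetical table; the constraint is checked at COMMIT, not per statement.
CREATE TABLE slots (
    id  integer PRIMARY KEY,
    pos integer,
    CONSTRAINT slots_pos_key UNIQUE (pos) DEFERRABLE INITIALLY DEFERRED
);

INSERT INTO slots VALUES (1, 10), (2, 20);

BEGIN;
UPDATE slots SET pos = 20 WHERE id = 1;  -- both rows now have pos = 20
UPDATE slots SET pos = 10 WHERE id = 2;  -- duplicate resolved
COMMIT;                                  -- uniqueness is verified here
```

With a plain unique index on pos instead of the deferrable constraint, the first UPDATE would fail immediately with a duplicate key error.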
A very minor thing that can be done with constraints only and not with indexes is using the ON CONFLICT ON CONSTRAINT clause (see also this question).
This doesn't work:
CREATE TABLE T (a INT PRIMARY KEY, b INT, c INT);
CREATE UNIQUE INDEX u ON t(b);
INSERT INTO T (a, b, c)
VALUES (1, 2, 3)
ON CONFLICT ON CONSTRAINT u
DO UPDATE SET c = 4
RETURNING *;
It produces:
[42704]: ERROR: constraint "u" for table "t" does not exist
Turn the index into a constraint:
DROP INDEX u;
ALTER TABLE t ADD CONSTRAINT u UNIQUE (b);
And the INSERT statement now works.
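If you would rather keep the plain unique index, an alternative sketch is to name the conflicting column instead of a constraint. Postgres then infers a matching unique index as the arbiter, and that works for indexes as well as constraints:

```sql
-- Same table and index as in the failing example above.
CREATE TABLE t (a INT PRIMARY KEY, b INT, c INT);
CREATE UNIQUE INDEX u ON t (b);

INSERT INTO t (a, b, c)
VALUES (1, 2, 3)
ON CONFLICT (b)          -- column inference instead of ON CONSTRAINT u
DO UPDATE SET c = 4
RETURNING *;
```

So the limitation applies only to the ON CONFLICT ON CONSTRAINT form, not to upserts in general.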
Another thing I've encountered is that you can use sql expressions in unique indexes but not in constraints.
So, this does not work:
CREATE TABLE users (
name text,
UNIQUE (lower(name))
);
but following works.
CREATE TABLE users (
name text
);
CREATE UNIQUE INDEX uq_name on users (lower(name));
There is a difference in locking.
Adding an index does not block read access to the table.
Adding a constraint does put a table lock (so all selects are blocked) since it is added via ALTER TABLE.
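One way to keep the lock short, sketched with a hypothetical table accounts: build the index concurrently first, then attach it as a constraint. The ALTER TABLE still takes a table lock, but it does not have to build the index or scan the table for duplicates while holding it:

```sql
-- Hypothetical table and names. Run outside a transaction block,
-- because CREATE INDEX CONCURRENTLY is not allowed inside one.
CREATE TABLE accounts (email text);

-- Slow part: builds the index without blocking reads or writes.
CREATE UNIQUE INDEX CONCURRENTLY accounts_email_idx ON accounts (email);

-- Fast part: promotes the existing index to a constraint.
-- The index is renamed to match the constraint name.
ALTER TABLE accounts
    ADD CONSTRAINT accounts_email_key UNIQUE USING INDEX accounts_email_idx;
```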
I read this in the doc:
ADD table_constraint [ NOT VALID ]
This form adds a new constraint to a table using the same syntax as CREATE TABLE, plus the option NOT VALID, which is currently only allowed for foreign key constraints. If the constraint is marked NOT VALID, the potentially-lengthy initial check to verify that all rows in the table satisfy the constraint is skipped. The constraint will still be enforced against subsequent inserts or updates (that is, they'll fail unless there is a matching row in the referenced table). But the database will not assume that the constraint holds for all rows in the table, until it is validated by using the VALIDATE CONSTRAINT option.
So I think it is what you call "partial uniqueness" by adding a constraint.
And, about how to ensure the uniqueness:
Adding a unique constraint will automatically create a unique B-tree index on the column or group of columns listed in the constraint. A uniqueness restriction covering only some rows cannot be written as a unique constraint, but it is possible to enforce such a restriction by creating a unique partial index.
Note: The preferred way to add a unique constraint to a table is ALTER TABLE … ADD CONSTRAINT. The use of indexes to enforce unique constraints could be considered an implementation detail that should not be accessed directly. One should, however, be aware that there’s no need to manually create indexes on unique columns; doing so would just duplicate the automatically-created index.
So we should add constraint, which creates an index, to ensure uniqueness.
How do I see this problem?
A "constraint" states the rule: it declares that the column must be unique. An "index" is about implementation: how uniqueness is actually achieved and checked. So the way PostgreSQL handles it is quite logical: first you declare that a column should be unique, then PostgreSQL implements that rule for you by adding a unique index.
SELECT a.phone_number,count(*) FROM public.users a
Group BY phone_number Having count(*)>1;
SELECT a.phone_number,count(*) FROM public.retailers a
Group BY phone_number Having count(*)>1;
select a.phone_number from users a inner join users b
on a.id <> b.id and a.phone_number = b.phone_number order by a.id;
select a.phone_number from retailers a inner join retailers b
on a.id <> b.id and a.phone_number = b.phone_number order by a.id
DELETE FROM
users a
USING users b
WHERE
a.id > b.id
AND a.phone_number = b.phone_number;
DELETE FROM
retailers a
USING retailers b
WHERE
a.id > b.id
AND a.phone_number = b.phone_number;
CREATE UNIQUE INDEX CONCURRENTLY users_phone_number
ON users (phone_number);
To Verify:
insert into users(name,phone_number,created_at,updated_at) select name,phone_number,created_at,updated_at from users
I have a table called CustomerMemo:
CustomerMemo
CustomerID
MemoID
Both of these are foreign keys. The columns are not unique because there could be something like this:
CustomerID MemoID
----------- -------
1 1
1 2
1 3
However, what I want to avoid is something like this:
CustomerID MemoID
----------- -------
1 1
1 1
Anyone have a clue how to do this in SQL Server?
If you actually want to ignore duplicate keys upon insert, then you'll need to use the IGNORE_DUP_KEY index option in your index or unique constraint definition.
Here is the documentation on MSDN:
CREATE INDEX (Transact-SQL)
Example from that article (in section D. Using the IGNORE_DUP_KEY option):
CREATE TABLE #Test (C1 nvarchar(10), C2 nvarchar(50), C3 datetime);
GO
CREATE UNIQUE INDEX AK_Index ON #Test (C2)
WITH (IGNORE_DUP_KEY = ON);
GO
INSERT INTO #Test VALUES (N'OC', N'Ounces', GETDATE());
INSERT INTO #Test SELECT * FROM Production.UnitMeasure;
GO
SELECT COUNT(*)AS [Number of rows] FROM #Test;
GO
DROP TABLE #Test;
GO
For your table, this would be the command:
CREATE UNIQUE INDEX UNQ_CustomerMemo ON CustomerMemo (MemoID, CustomerID)
WITH (IGNORE_DUP_KEY = ON);
The disadvantage to using IGNORE_DUP_KEY is that you lose visibility on what data is violating the unique constraint. Generally it is better to ensure the data is unique before inserting and then when you do have something fall through the cracks, you will get the error, along with the values that violated the unique constraint. This will allow for much easier troubleshooting of your insert statement. That being said, I make liberal use of this option when defining table variables because the scope is limited.
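A minimal sketch of the "ensure the data is unique before inserting" approach, assuming the unique index on CustomerMemo is created without IGNORE_DUP_KEY. The variables @CustomerID and @MemoID are hypothetical placeholders for the values being inserted:

```sql
-- Hypothetical input values.
DECLARE @CustomerID int = 1, @MemoID int = 1;

-- Insert the pair only if it is not already present.
INSERT INTO CustomerMemo (CustomerID, MemoID)
SELECT @CustomerID, @MemoID
WHERE NOT EXISTS (
    SELECT 1
    FROM CustomerMemo
    WHERE CustomerID = @CustomerID
      AND MemoID = @MemoID
);
```

Under concurrent inserts this check alone can still race, so the unique index remains the real guarantee. The difference is that a violation now raises an error with the offending values instead of being silently dropped.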
As for whether or not you should use a unique key or a unique index, see the following question on stack overflow:
Unique key vs. unique index on SQL Server 2008
You want to use the DISTINCT keyword in your select.
Or, if you want to prevent there ever being a record with those same keys, you want a UNIQUE constraint on CustomerID and MemoID:
ALTER TABLE CustomerMemo
ADD CONSTRAINT [uc_CustomerMemo] UNIQUE(CustomerID, MemoID)
Another alternative is to make the primary key of CustomerMemo a composite primary key on CustomerID and MemoID:
ALTER TABLE CustomerMemo
ADD CONSTRAINT pk_CustomerMemo PRIMARY KEY(CustomerID, MemoID)
Either set up a unique composite key that includes both CustomerID and MemoID, or make your primary key a composite of both CustomerID and MemoID. This will ensure you cannot insert duplicates like that.
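Adding either kind of key fails if the table already contains duplicate pairs, so it can help to look for them first. A short sketch against the CustomerMemo table from the question:

```sql
-- List every (CustomerID, MemoID) pair that occurs more than once.
SELECT CustomerID, MemoID, COUNT(*) AS Copies
FROM CustomerMemo
GROUP BY CustomerID, MemoID
HAVING COUNT(*) > 1;
```

If this returns no rows, the unique constraint or composite primary key can be added without cleanup.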
Please check out this thread:
How can I create a SQL unique constraint based on 2 columns?
For example, for SQL Server 2005 see: SQL Server 2005 Unique constraint on two columns
You should also look up the phrase UNIQUE CLUSTERED in the documentation for your database.
If I have a table where AId is the primary key and BId and CId are foreign keys referencing their tables. I need to make the combination of BId and CId unique.
How would I alter the table to make the combination unique?
Thanks
AId BId CId Notes Date
=== === === ===== ====
1 200 1 Random 2/2/2005
2 201 2 ETC 2/8/2007
3 202 3 ETC 2/12/2012
You need to create a unique index or a unique constraint.
I generally prefer unique indexes as they are more flexible (can add included columns if desired) and they don't have to be uniquely named in the schema.
Example syntax for a unique index
CREATE UNIQUE NONCLUSTERED INDEX SomeIndex ON YourTable(BId, CId)
The order of BId, CId makes no difference to the uniqueness guarantee, but it does affect the queries the index can efficiently support (that way round supports looking up by BId or by BId, CId, but not by CId alone).
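To make the column-order point concrete, here are three hypothetical lookups against YourTable with the index on (BId, CId). The comments describe what the index can support; the optimizer makes the actual choice:

```sql
-- Filter on the leading column: an index seek is possible.
SELECT AId FROM YourTable WHERE BId = 200;

-- Filter on both columns: an index seek is possible.
SELECT AId FROM YourTable WHERE BId = 200 AND CId = 1;

-- Filter on the second column only: no seek on this index,
-- so expect a scan unless another index covers CId.
SELECT AId FROM YourTable WHERE CId = 1;
```

If lookups by CId alone are common, declaring the index as (CId, BId) or adding a separate index on CId may be worth considering.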
Try this:
ALTER TABLE myTable
ADD CONSTRAINT myConstraint
UNIQUE NONCLUSTERED
(
BId, CId
)
You can create unique indexes that are not the primary key. This will force the combination of BId and CId to be unique.
CREATE UNIQUE INDEX ix_ATable_AltUniqueIndex ON ATable(BId,CId)
Or you can create a unique constraint
ALTER TABLE ATable ADD CONSTRAINT uni_ATable UNIQUE (BId,CId)
Creating a unique constraint appears to create a unique index as well.