Uniqueidentifier Philosophy in my scneario - sql

i have a three column table which none of the columns are unique. so I added an Id column to my table as uniqueidentifier and made it primary. Just One Quesion: If two rows with the same values in those columns will be added to the table, what happens to the second one? will it be added or not? How Can I avoid such things in my scenario?

If your three non-ID columns do not have a unique constraint, and are not part of the Primary Key, you can add multiple rows with the exact same values.
If you want to avoid duplicate rows, your best bet would be to apply a unique constraint on at least one or more of the three columns. If you can't do that for some reason, you should post your schema and where you think the problem is to get some more help with it.
If you can't use a unique constraint, one way to avoid duplicates would be: when you're going to insert a record, first check to see if the record already exists, and if it already exists, decide how you want to handle it.

If your goal is to enforce uniqueness across the three columns then you basically have a choice of two structures:
Create Table tbl_1
(
ColA int
,ColB varchar(32)
,ColC varchar(256)
,Primary Key (ColA, ColB, ColC)
)
GO
OR
Create Table tbl_2
(
ID int identity(1,1) Primary Key
,ColA int
,ColB varchar(32)
,ColC varchar(256)
,Unique (ColA, ColB, ColC)
)
GO
There are advantages to either technique, which is better depends on the nature of the data and your interactions with it. However in both structures the values in any given column can repeat but the combination of the three must be unique for each row.
On the other hand if you are just trying to set a PK on data which may or may not be unique then you can use structure 2 but without the unique constraint. In which case each row is uniquely identified by the ID column but the values of the other columns can repeat freely.

The second one will be added. You can avoid it by creating a unique constraint.

Related

Sql combine value of two columns as primary key

I have a SQL server table on which I insert account wise data. Same account number should not be repeated on the same day but can be repeated if the date changes.
The customer retrieves the data based on the date and account number.
In short the date + account number is unique and should not be duplicate.
As both are different fields should I concatenate both and create a third field as primary key or there is option of having a primary key on the merge value.
Please guide with the optimum way.
You can create a composite primary key. When you create the table, you can do this sort of thing in SQL Server;
CREATE TABLE TableName (
Field1 varchar(20),
Field2 INT,
PRIMARY KEY (Field1, Field2))
Take a look at this question which helps with each flavour of SQL
How can I define a composite primary key in SQL?
PLEASE HAVE A LOOK, IT WILL CLEAR MOST OF THE DOUBTS !
We can state 2 or more columns combined as a primary key.
In that case every column included in primary key will be called : Composite Key
And mind you Composite keys can never be null !!
Now, first let me show you how to make 2 or more columns as primary key.
create table table_name ( col1 type, col2 type, primary key(col1, col2));
The benefit is :
col1 has value (X) and col2 has value (Y) then no other row can have col1 as (X) and col2 as (Y).
col1, col2 must have some values, they can't be null !!
HOPE THIS HELPS !
Not at all. Just use a primary key constraint:
alter table t add constraint pk_accountnumber_date primary key (accountnumber, date)
You can also include this in the create table statement.
I might suggest, however, that you use an auto-incrementing/identity/serial primary key -- a unique number for each row. Then declare the account number/date combination as a unique key. I prefer such synthetic primary keys for several reasons:
They make it easy to refer to a row in foreign key relationships.
They show the insert order into the table, so you can readily see the last inserted rows.
They make it simple to identify a single row for updates and deletes.
They hide the "id" information of the row from referring tables and applications.
The alternative is to have a PK which is an autoincrementing number and then put a unique unique index on the natural key. In this way uniqueness is preserved but you have the fastest possible joining to any child tables. If the table will not ever have child tables, the composite PK is a good idea. If there will be many child tables, this is could be a better choice.

How to ignore duplicate Primary Key in SQL?

I have an excel sheet with several values which I imported into SQL (book1$) and I want to transfer the values into ProcessList. Several rows have the same primary keys which is the ProcessID because the rows contain original and modified values, both of which I want to keep. How do I make SQL ignore the duplicate primary keys?
I tried the IGNORE_DUP_KEY = ON but for rows with duplicated primary key, only 1 the latest row shows up.
CREATE TABLE dbo.ProcessList
(
Edited varchar(1),
ProcessId int NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY = ON),
Name varchar(30) NOT NULL,
Amount smallmoney NOT NULL,
CreationDate datetime NOT NULL,
ModificationDate datetime
)
INSERT INTO ProcessList SELECT Edited, ProcessId, Name, Amount, CreationDate, ModificationDate FROM Book1$
SELECT * FROM ProcessList
Also, if I have a row and I update the values of that row, is there any way to keep the original values of the row and insert a clone of that row below, with the updated values and creation/modification date updated automatically?
How do I make SQL ignore the duplicate primary keys?
Under no circumstances can a transaction be committed that results in a table containing two distinct rows with the same primary key. That is fundamental to the nature of a primary key. SQL Server's IGNORE_DUP_KEY option does not change that -- it merely affects how SQL Server handles the problem. (With the option turned on it silently refuses to insert rows having the same primary key as any existing row; otherwise, such an insertion attempt causes an error.)
You can address the situation either by dropping the primary key constraint or by adding one or more columns to the primary key to yield a composite key whose collective value is not duplicated. I don't see any good candidate columns for an expanded PK among those you described, though. If you drop the PK then it might make sense to add a synthetic, autogenerated PK column.
Also, if I have a row and I update the values of that row, is there any way to keep the original values of the row and insert a clone of that row below, with the updated values and creation/modification date updated automatically?
If you want to ensure that this happens automatically, however a row happens to be updated, then look into triggers. If you want a way to automate it, but you're willing to make the user ask for the behavior, then consider a stored procedure.
try this
INSERT IGNORE INTO ProcessList SELECT Edited, ProcessId, Name, Amount, CreationDate, ModificationDate FROM Book1$
SELECT * FROM ProcessList
You drop the constraint. Something like this:
alter table dbo.ProcessList drop constraint PK_ProcessId;
You need to know the constraint name.
In other words, you can't ignore a primary key. It is defined as unique and not-null. If you want the table to have duplicates, then that is not the primary key.

How to force duplicate key insert mssql

I know this sounds crazy (And if I designed the database I would have done it differently) but I actually want to force a duplicate key on an insert. I'm working with a database that was designed to have columns as 'not null' pk's that have the same value in them for every row. The records keeping software I'm working with is somehow able to insert dups into these columns for every one of its records. I need to copy data from a column in another table into one column on this one. Normally I just would try to insert into that column only, but the pk's are set to 'not null' so I have to put something in them, and the way the table is set up that something has to be the same thing for every record. This should be impossible but the company that made the records keeping software made it work some how. I was wondering if anyone knows how this could be done?
P.S. I know this is normally not a good idea at all. So please just include suggestions for how this could be done regardless of how crazy it is. Thank you.
A SQL Server primary key has to be unique and NOT NULL. So, the column you're seeing duplicate data in cannot be the primary key on it's own. As urlreader suggests, it must be part of a composite primary key with one or more other columns.
How to tell what columns make up the primary key on a table: In Enterprise Manager, expand the table and then expand Columns. The primary key columns will have a "key" symbol next to them. You'll also see "PK" in the column description after, like this:
MyFirstIDColumn (PK, int, not null)
MySecondIDColumn (PK, int, not null)
Once you know which columns make up the primary key, simply ensure that you are inserting a combination of unique data into the columns. So, for my sample table above, that would be:
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,1) --SUCCEED
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,2) --SUCCEED
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,1) --FAIL because of duplicate (1,1)
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,3) --SUCCEED
More on primary keys:
http://msdn.microsoft.com/en-us/library/ms191236%28v=sql.105%29.aspx

Should I use a unique constraint in a table even though it isn't necessarily required?

In Microsoft SQL Server, when creating tables, are there any downsides to using a unique constraint on a column even though you don't really need it to be unique?
An example would be descriptions for say a role in a user management system:
CREATE TABLE Role
(
ID TINYINT PRIMARY KEY NOT NULL IDENTITY(0, 1),
Title CHARACTER VARYING(32) NOT NULL UNIQUE,
Description CHARACTER VARYING(MAX) NOT NULL UNIQUE
)
My fear is that validating this constraint when doing frequent insertions in other tables will be a very time consuming process. I am unsure as to how this constraint is validated, but I feel like it could be done in a very efficient way or as a linear comparison.
Your fear becomes true: UNIQUE constraint are implemented as indices, and this is time and space consuming.
So, whenever you insert a new row, the database have to update the table, and also one index for each unique constraint.
So, according to you:
using a unique constraint on a column even though you don't really need it to be unique
the answer is no, don't use it. there are time and space downsides.
Your sample table would need a clustered index for the Id, and 2 extra indices, one for each unique constraint. This takes up space, and time to update the 3 indices on the inserts.
This would only be justified if you made queries filtering by those fields.
BY THE WAY:
The original post sample table have several flaws:
that syntax is not SQL Server syntax (and you tagged this as SQL Server)
you cannot create an index in a varchar(max) column
If you correct the syntax and create this table:
CREATE TABLE Role
(
ID tinyint PRIMARY KEY NOT NULL IDENTITY(0, 1),
Title varchar(32) NOT NULL UNIQUE,
Description varchar(32) NOT NULL UNIQUE
)
You can then execute sp_help Role and you'll find the 3 indices.
The database creates an index which backs up the UNIQUE constraint, so it should be very low-cost to do the uniqueness check.
http://msdn.microsoft.com/en-us/library/ms177420.aspx
The Database Engine automatically creates a UNIQUE index to enforce the uniqueness requirement of the UNIQUE constraint. Therefore, if an attempt to insert a duplicate row is made, the Database Engine returns an error message that states the UNIQUE constraint has been violated and does not add the row to the table. Unless a clustered index is explicitly specified, a unique, nonclustered index is created by default to enforce the UNIQUE constraint.
Is it typically a good practice to constrain it if you know the data
will always be unique but it doesn't necessarily need to be unique for
the application to function correctly?
My question to you: would it make sense for two roles to have different titles but the same description? e.g.
INSERT INTO Role ( Title , Description )
VALUES ( 'CEO' , 'Senior manager' ),
( 'CTO' , 'Senior manager' );
To me it would seem to devalue the use of the description; if there were many duplications then it might make more sense to do something more like this:
INSERT INTO Role ( Title )
VALUES ( 'CEO' ),
( 'CTO' );
INSERT INTO SeniorManagers ( Title )
VALUES ( 'CEO' ),
( 'CTO' );
But then again you are not expecting duplicates.
I assume this is a low activity table. You say you fear validating this constraint when doing frequent insertions in other tables. Well, that will not happen (unless there is a trigger we cannot see that might update this table when another table is updated).
Personally, I would ask the designer (business analyst, whatever) to justify not applying a unique constraint. If they cannot then I would impose the unqiue constraint based on common sense. As is usual for such a text column, I would also apply CHECK constraints e.g. to disallow leading/trailing/double spaces, zero-length string, etc.
On SQL Server, the data type tinyint only gives you 256 distinct values. No matter what you do outside of the id column, you're not going to end up with a very big table. It will surely perform quickly even with a dozen indexed columns.
You usually need at least one unique constraint besides the surrogate key, though. If you don't have one, you're liable to end up with data like this.
1 First title First description
2 First title First description
3 First title First description
...
17 Third title Third description
18 First title First description
Tables that permit data like that are usually wrong. Any table that uses foreign key references to this table won't be able to report correctly, say, the number of "First title" used.
I'd argue that allowing multiple, identical titles for roles in a user management system is a design error. I'd probably argue that "title" is a really bad name for that column, too.

How to create a unique index on a NULL column?

I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.
My current solution involves a unique index on a view like so:
CREATE VIEW vw_unq WITH SCHEMABINDING AS
SELECT Column1
FROM MyTable
WHERE Column1 IS NOT NULL
CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)
Any better ideas?
Using SQL Server 2008, you can create a filtered index.
CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL
Another option is a trigger to check uniqueness, but this could affect performance.
The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:
CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)
Pretty sure you can't do that, as it violates the purpose of uniques.
However, this person seems to have a decent work around:
http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html
It is possible to use filter predicates to specify which rows to include in the index.
From the documentation:
WHERE <filter_predicate> Creates a filtered index by specifying which
rows to include in the index. The filtered index must be a
nonclustered index on a table. Creates filtered statistics for the
data rows in the filtered index.
Example:
CREATE TABLE Table1 (
NullableCol int NULL
)
CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;
Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.
However, that doesn't mean the concept of "unique nullable columns" is valid; to actually implement it in any relational database we just have to bear in mind that this kind of databases are meant to be normalized to properly work, and normalization usually involves the addition of several (non-entity) extra tables to establish relationships between the entities.
Let's work a basic example considering only one "unique nullable column", it's easy to expand it to more such columns.
Suppose we the information represented by a table like this:
create table the_entity_incorrect
(
id integer,
uniqnull integer null, /* we want this to be "unique and nullable" */
primary key (id)
);
We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):
create table the_entity
(
id integer,
primary key(id)
);
create table the_relation
(
the_entity_id integer not null,
uniqnull integer not null,
unique(the_entity_id),
unique(uniqnull),
/* primary key can be both or either of the_entity_id or uniqnull */
primary key (the_entity_id, uniqnull),
foreign key (the_entity_id) references the_entity(id)
);
To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.
For rows in the_entity were no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.
Note that values for uniqnull will be unique for all the_relation, and also notice that for each value in the_entity there can be at most one value in the_relation, since the primary and foreign keys on it enforce this.
Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:
start transaction;
insert into the_entity (id) values (3);
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;
And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:
start transaction;
insert into the_entity (id) values (10);
commit;
To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:
select
id, uniqnull
from
the_entity left outer join the_relation
on
the_entity.id = the_relation.the_entity_id
;
The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching columns are present in the_relation.
Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.