Adding a constraint to prevent duplicates in SQL Update Trigger

We have a user table in which every user has a unique email and username. We try to enforce this in our code, but we want to be sure users are never inserted (or updated) in the database with the same username or email.
I've added an INSTEAD OF INSERT trigger which prevents the insertion of duplicate users.
CREATE TRIGGER [dbo].[BeforeUpdateUser]
ON [dbo].[Users]
INSTEAD OF INSERT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @Email nvarchar(MAX)
DECLARE @UserName nvarchar(MAX)
DECLARE @UserId int
DECLARE @DoInsert bit
SET @DoInsert = 1
SELECT @Email = Email, @UserName = UserName FROM INSERTED
SELECT @UserId = UserId FROM Users WHERE Email = @Email
IF (@UserId IS NOT NULL)
BEGIN
SET @DoInsert = 0
END
SELECT @UserId = UserId FROM Users WHERE UserName = @UserName
IF (@UserId IS NOT NULL)
BEGIN
SET @DoInsert = 0
END
IF (@DoInsert = 1)
BEGIN
INSERT INTO Users
SELECT
FirstName,
LastName,
Email,
Password,
UserName,
LanguageId,
Data,
IsDeleted
FROM INSERTED
END
ELSE
BEGIN
DECLARE @ErrorMessage nvarchar(MAX)
SET @ErrorMessage =
'The username and emailadress of a user must be unique!'
RAISERROR 50001 @ErrorMessage
END
END
But for the update trigger I have no idea how to do this.
I've found this example with Google:
http://www.devarticles.com/c/a/SQL-Server/Using-Triggers-In-MS-SQL-Server/2/
But I don't know if it applies when you update multiple columns at once.
EDIT:
I've tried to add a unique constraint on these columns but it doesn't work:
Msg 1919, Level 16, State 1, Line 1
Column 'Email' in table 'Users' is of a type
that is invalid for use as a key column in an index.

You can add a unique constraint on the table; this will raise an error if an insert or update would create duplicates.
ALTER TABLE [Users] ADD CONSTRAINT [IX_UniqueUserEmail] UNIQUE NONCLUSTERED
(
[Email] ASC
)
ALTER TABLE [Users] ADD CONSTRAINT [IX_UniqueUserName] UNIQUE NONCLUSTERED
(
[UserName] ASC
)
EDIT: OK, I've just read your comments on another post and seen that you're using NVARCHAR(MAX) as your data type. Is there a reason why you might want more than 4000 characters for an email address or username? This is where your problem lies. If you reduce this to NVARCHAR(250) or thereabouts, then you can use a unique index.
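A minimal sketch of that change, assuming no existing value is longer than 250 characters and neither column contains NULLs (both are assumptions about the asker's data, not something stated in the question). Once the columns are narrowed and the duplicate check returns no rows, the two ADD CONSTRAINT statements above should succeed.

```sql
-- Shrink the columns so they are valid index keys.
-- Assumption: every existing value fits in 250 characters and is not NULL.
ALTER TABLE dbo.Users ALTER COLUMN Email nvarchar(250) NOT NULL;
ALTER TABLE dbo.Users ALTER COLUMN UserName nvarchar(250) NOT NULL;

-- List any existing duplicates; the unique constraint cannot be
-- created until these queries return no rows.
SELECT Email, COUNT(*) AS Occurrences
FROM dbo.Users
GROUP BY Email
HAVING COUNT(*) > 1;

SELECT UserName, COUNT(*) AS Occurrences
FROM dbo.Users
GROUP BY UserName
HAVING COUNT(*) > 1;
```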

Sounds like a lot of work instead of just using one or more unique indexes. Is there a reason you haven't gone the index route?

Why not just use the UNIQUE attribute on the column in your database? Setting that will make the SQL server enforce that and throw an error if you try to insert a dupe.

You should use a SQL UNIQUE constraint on each of these columns for that.

You can create a UNIQUE INDEX on an NVARCHAR column as long as it's NVARCHAR(450) or less, which corresponds to the classic 900-byte limit on index keys.
Do you really need a UNIQUE column to be so large?

In general, I would avoid triggers wherever possible, as they can make the behaviour very hard to understand unless you know that the trigger exists. As other commentators have said, a unique constraint is the way to go (once you have amended your column definitions to allow it).
If you ever find yourself needing to use a trigger, it may be a sign that your design is flawed. Think hard about why you need it and whether it is performing logic that belongs elsewhere.

Be aware that if you use the UNIQUE constraint/index solution with SQL Server, only one null value will be permitted in that column. So, for example, if you wanted the email address to be optional, it wouldn't work, because only one user could have a null email address. In that case, you would have to resort to another approach like a trigger or a filtered index.
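For the filtered index route mentioned above, a minimal sketch might look like the following. It assumes SQL Server 2008 or later (filtered indexes do not exist in earlier versions) and that Email has already been narrowed to an indexable size; the index name is made up for illustration.

```sql
-- Uniqueness is enforced only for rows that actually have an email,
-- so any number of users may leave Email as NULL.
-- Assumption: SQL Server 2008+ and Email is nvarchar(450) or narrower.
CREATE UNIQUE NONCLUSTERED INDEX UX_Users_Email_NotNull
ON dbo.Users (Email)
WHERE Email IS NOT NULL;
```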

Related

SQL - Unique key across 2 columns of same table?

I use SQL Server 2016. I have a database table called "Member".
In that table, I have these 3 columns (for the purpose of my question):
idMember [INT - Identity - Primary Key]
memEmail
memEmailPartner
I want to prevent a row from using an email that already exists in the table.
Both email columns are not mandatory, so they can be left blank (NULL).
If I create a new Member:
If not blank, the values entered for "memEmail" and "memEmailPartner" (independently) should not be found in any other row, in either memEmail or memEmailPartner.
So if I want to create a row with email (dominic@email.com) I must not find any occurrences of that value in memEmail or memEmailPartner.
If I update an existing Member:
I must not find any occurrences of that value in memEmail or memEmailPartner, with the exception of the row I am updating (idMember), which may already have the value in memEmail or memEmailPartner.
--
From what I read on Google, it should be possible to do something with a Function-Based Check Constraint but I can't make that work.
Does anyone have a solution to my problem?
Thank you.

I may have misunderstood exactly what you were asking but it looks like you want a simple upsert query with IF EXISTS conditions.
DECLARE @emailAddress VARCHAR(255)= 'dominic@email.com', --dummy value
@id INT= 2; --dummy value
IF NOT EXISTS
(
SELECT 1
FROM #Member
WHERE memEmail = @emailAddress
OR memEmailPartner = @emailAddress
)
BEGIN
SELECT 'insert';
END;
ELSE IF EXISTS
(
SELECT 1
FROM #Member
WHERE idMember = #id
)
BEGIN
SELECT 'update';
END;

A trigger is the traditional way of doing what you're asking for. Here's a simple demo:
--if object_id('member') is not null drop table member
go
create table member (
idMember INT Identity Primary Key,
memEmail varchar(100),
memEmailPartner varchar(100)
)
go
create trigger trg_member on member after insert, update as
begin
set nocount on
if exists (select 1 from member m join inserted i on i.memEmail = m.memEmail and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmail = m.memEmailPartner and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmailPartner = m.memEmail and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmailPartner = m.memEmailPartner and i.idMember <> m.idMember)
begin
raiserror('Email addresses must be unique.', 16, 1)
rollback
end
end
go
insert member(memEmail, memEmailPartner) values('a@a.com', null), ('b@b.com', null), (null, 'c@c.com'), (null, 'd@d.com')
go
select * from member
insert member(memEmail, memEmailPartner) values('a@a.com', null) -- should fail
go
insert member(memEmail, memEmailPartner) values(null, 'a@a.com') -- should fail
go
insert member(memEmail, memEmailPartner) values('c@c.com', null) -- should fail
go
insert member(memEmail, memEmailPartner) values(null, 'c@c.com') -- should fail
go
insert member(memEmail, memEmailPartner) values('e@e.com', null) -- should work
go
insert member(memEmail, memEmailPartner) values(null, 'f@f.com') -- should work
go
select * from member
-- Make sure updates still work!
update member set memEmail = memEmail, memEmailPartner = memEmailPartner
I've not tested this extensively but it should be enough to get you started if you want to try this approach.
StuartLC notes the potential for the UDF check constraint to fail in set-based updates and/or various other conditions; triggers don't have this problem.
Stuart also suggests reconsidering whether this should really be a database constraint or managed through business logic elsewhere. I'm inclined to agree - my gut feel here is that sooner or later you will come across a situation that requires email addresses to be reused, or in some other way not strictly unique.

TL;DR
The wisdom of applying this kind of business rule logic in the database needs to be reconsidered - this check is likely a better candidate for your application, or a stored procedure which acts as an insert gate keeper instead of direct new row inserts into the table.
Ignoring the Warnings
That said, I do believe that what you want is possible in a constraint UDF, albeit with potentially atrocious performance consequences*1, and it is likely prone to race conditions in set-based updates.
Here's a user defined function which applies the unique email logic across both columns. Note that by the time the constraint is checked, the row is already IN the table, hence the new row itself needs to be excluded from the duplicate checks.
My code is also dependent on ANSI NULL behaviour, i.e. that the predicates NULL = NULL and X IN (NULL) both return NULL, and hence are excluded from the failure check (in order to meet your requirement that NULLs do not fail the rule).
We also need to check for the insert of BOTH new columns being non-null, but duplicated.
So here's a UDF doing the checking:
CREATE FUNCTION dbo.CheckUniqueEmails(@id int, @memEmail varchar(50),
@memEmailPartner varchar(50))
RETURNS bit
AS
BEGIN
DECLARE @retval bit;
IF @memEmail = @memEmailPartner
OR EXISTS (SELECT 1 FROM MyTable WHERE memEmail IS NOT NULL
AND memEmail IN(@memEmail, @memEmailPartner) AND idMember <> @id)
OR EXISTS (SELECT 1 FROM MyTable WHERE memEmailPartner IS NOT NULL
AND memEmailPartner IN(@memEmail, @memEmailPartner) AND idMember <> @id)
SET @retval = 0
ELSE
SET @retval = 1;
RETURN @retval;
END;
GO
Which is then enforced in a CHECK constraint:
ALTER TABLE MyTable ADD CHECK (dbo.CheckUniqueEmails(
idMember, memEmail, memEmailPartner) = 1);
I've put a SQLFiddle up here
Uncomment the 'failed' test cases to ensure that the above check constraint is working.
I haven't tested this with updates, and as per Martin's advice on the link, this will likely break on an insert with multiple rows.
*1 - we'll need indexes on BOTH email address columns.

Conditional INSERT subquery of larger insert

I have a set of tables which track access logs. The logs contain data about the user's access including user agent strings. Since we know that user agent strings are, for all intents and purposes, practically unlimited, these would need to be stored as a text/blob type. Given the high degree of duplication, I'd like to store these in a separate reference table and have my main access log table have an id linking to it. Something like this:
accesslogs table:
username|accesstime|ipaddr|useragentid
useragents table:
id|crc32|md5|useragent
(the hashes are for indexing and quicker searching)
Here's the catch: I am working inside a framework that doesn't give me access to create fancy things like foreign keys. In addition, this has to be portable across multiple DBMSs. I have the join logic worked out for doing SELECTs, but I am having trouble figuring out how to insert properly. I want to do something like
INSERT INTO accesslogs (username, accesstime, ipaddr, useragentid)
VALUES
(
:username,
:accesstime,
:ipaddr,
(
CASE WHEN
(
SELECT id
FROM useragents
WHERE
useragents.crc32 = :useragentcrc32
AND
useragents.md5 = :useragentmd5
AND useragents.useragent LIKE :useragent
) IS NOT NULL
THEN
THAT_SAME_SELECT_FROM_ABOVE()
ELSE
GET_INSERT_ID_FROM(INSERT INTO useragents (crc32, md5, useragent) VALUES (:useragentcrc32, :useragentmd5, :useragent))
)
)
Is there any way to do this that doesn't use pseudofunctions whose names I just made up? The two parts I'm missing are how to reuse the select from above and how to get the new id from a subquery insert.

You will need to do separate inserts into each of the tables. You cannot insert into both at the same time.
If you use MS SQL Server, once you have inserted the row you can get its id with SCOPE_IDENTITY(), and then use it in the insert into the other table.
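As a rough SQL Server sketch of that two-step approach, using the table and column names from the question: the variable types, the placeholder values, and the assumption that useragents.id is an IDENTITY column are all mine, not the asker's.

```sql
-- Placeholder values; in practice these would arrive as parameters.
DECLARE @username    nvarchar(50)  = N'alice';
DECLARE @ipaddr      varchar(45)   = '192.0.2.1';
DECLARE @useragent   nvarchar(max) = N'ExampleAgent/1.0';
DECLARE @crc32       bigint        = 123456789;  -- hash computed by the application
DECLARE @md5         char(32)      = '0123456789abcdef0123456789abcdef';
DECLARE @useragentid int;

-- Step 1: look for an existing user agent row.
SELECT @useragentid = id
FROM useragents
WHERE crc32 = @crc32
  AND md5 = @md5
  AND useragent = @useragent;

-- Step 2: insert it only if it is missing, then pick up the new id.
-- Assumption: useragents.id is an IDENTITY column.
IF @useragentid IS NULL
BEGIN
    INSERT INTO useragents (crc32, md5, useragent)
    VALUES (@crc32, @md5, @useragent);

    SET @useragentid = SCOPE_IDENTITY();
END;

-- Step 3: insert the log row that references it.
INSERT INTO accesslogs (username, accesstime, ipaddr, useragentid)
VALUES (@username, GETDATE(), @ipaddr, @useragentid);
```

Note that two sessions running this at the same moment could both miss in step 1 and insert the same user agent twice, so a surrounding transaction or a unique index on the hash columns is probably still needed.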

I'm not sure there is a cross platform way of doing this. You may have to have a lot of special cases for each supported back end. For Example, for SQL Server you'd use the merge statement as the basis of the solution. Other DBMSs have different names if they support it at all. Searching for "Upsert" might help.
Edit - added the second query to be explicit, and added parameters.
-- SQL Server Example
--Schema Defs
Create Table Test (
id int not null identity primary key,
UserAgent nvarchar(50)
)
Create Table WebLog (
UserName nvarchar(50),
IPAddress nvarchar(50),
UserAgentID int
)
Create Unique Index UQ_UserAgent On Test(UserAgent)
-- Values parsed from log
Declare
@UserName nvarchar(50) = N'Loz',
@IPAddress nvarchar(50) = N'1.1.1.1',
@UserAgent nvarchar(50) = 'Test'
Declare @id int
-- Optionally Begin Transaction
-- Insert if necessary and get id
Merge
Into dbo.Test as t
Using
(Select @UserAgent as UserAgent) as s
On
t.[UserAgent] = s.[UserAgent]
When Matched Then
Update Set @id = t.id
When Not Matched Then
Insert (UserAgent) Values (s.UserAgent);
If @id Is Null Set @id = scope_identity()
Insert Into WebLog (UserName, IPAddress, UserAgentID) Values (@UserName, @IPAddress, @id)
-- Optionally Commit Transaction

Stored procedures

I have a stored procedure that inserts a user into a table, but I want an output value equal to the newly inserted UserID. I don't know how to do it; can you help me?
I have this
ALTER PROCEDURE dbo.st_Insert_User
(
@Nombre varchar(200),
@Usuario varchar(100),
@Password varchar(100),
@Administrador bit,
@Resultado int output
)
AS
INSERT INTO tbl_Users(Nombre, Usuario, Password, Administrador)
VALUES(@Nombre, @Usuario, @Password, @Administrador)
SELECT @resultado = UserID
I also tried
SELECT @resultado = UserID FROM tbl_Users WHERE Usuario = @Usuario

SELECT SCOPE_IDENTITY()
will give you the identity of the row
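Applied to the procedure from the question, that might look like the sketch below. It assumes UserID is an IDENTITY column on tbl_Users, which the question implies but does not state; the values in the usage call are placeholders.

```sql
ALTER PROCEDURE dbo.st_Insert_User
(
    @Nombre        varchar(200),
    @Usuario       varchar(100),
    @Password      varchar(100),
    @Administrador bit,
    @Resultado     int OUTPUT
)
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO tbl_Users (Nombre, Usuario, Password, Administrador)
    VALUES (@Nombre, @Usuario, @Password, @Administrador);

    -- Assumption: UserID is an IDENTITY column, so SCOPE_IDENTITY()
    -- returns the value generated by the INSERT above.
    SET @Resultado = SCOPE_IDENTITY();
END
GO

-- Usage: the caller must also mark the argument as OUTPUT.
DECLARE @NewId int;
EXEC dbo.st_Insert_User
    @Nombre = 'Ana',
    @Usuario = 'ana',
    @Password = 'placeholder',
    @Administrador = 0,
    @Resultado = @NewId OUTPUT;
SELECT @NewId AS NewUserID;
```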

For SQL Server, you want to use the OUTPUT clause. See information and examples in Books Online here. It does cover your case-- as it mentions "The OUTPUT clause may be useful to retrieve the value of identity or computed columns after an INSERT or UPDATE operation."
(If this is for real world purposes, you do of course have security concerns in storing passwords that you should address.)
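A small sketch of the OUTPUT clause for this case, again assuming UserID is an IDENTITY column; the table variable name and the inserted values are placeholders.

```sql
-- Table variable that receives the generated key(s).
DECLARE @NewIds TABLE (UserID int);

INSERT INTO tbl_Users (Nombre, Usuario, Password, Administrador)
OUTPUT inserted.UserID INTO @NewIds (UserID)
VALUES ('Ana', 'ana', 'placeholder', 0);

-- One row per inserted user, which also works for multi-row inserts.
SELECT UserID FROM @NewIds;
```

Unlike SCOPE_IDENTITY(), this captures every generated id when several rows are inserted in one statement.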

Add at the end
select @@IDENTITY

SQL Table Locking

I have an SQL Server locking question regarding an application we have in house. The application takes submissions of data and persists them into an SQL Server table. Each submission is also assigned a special catalog number (unrelated to the identity field in the table) which is a sequential alpha numeric number. These numbers are pulled from another table and are not generated at run time. So the steps are
Insert Data into Submission Table
Grab next Unassigned Catalog Number from Catalog Table
Assign the Catalog Number to the Submission in the Submission table
All these steps happen sequentially in the same stored procedure.
It's rare, but sometimes we manage to get two submissions in the same second and they both get assigned the same Catalog Number, which causes a localized version of the Apocalypse in our company for a small while.
What can we do to limit the over assignment of the catalog numbers?

When getting your next catalog number, use row locking to protect the time between you finding it and marking it as in use, e.g.:
set transaction isolation level REPEATABLE READ
begin transaction
select top 1 @catalog_number = catalog_number
from catalog_numbers with (updlock,rowlock)
where assigned = 0
update catalog_numbers set assigned = 1 where catalog_number = @catalog_number
commit transaction

You could use an identity field to produce the catalog numbers, that way you can safely create and get the number:
insert into Catalog default values
set @CatalogNumber = scope_identity()
The scope_identity function will return the id of the last record created in the same session, so separate sessions can create records at the same time and still end up with the correct id.
If you can't use an identity field to create the catalog numbers, you have to use a transaction to make sure that you can determine the next number and create it without another session accessing the table.

I like araqnid's response. You could also use an insert trigger on the submission table to accomplish this. The trigger would be in the scope of the insert, and you would effectively embed the logic to assign the catalog_number in the trigger. Just wanted to put your options up here.

Here's the easy solution. No race condition. No blocking from a restrictive transaction isolation level. Probably won't work in SQL dialects other than T-SQL, though.
I assume there is some outside force at work to keep your catalog number table populated with unassigned catalog numbers.
This technique should work for you: just do the same sort of "interlocked update" that retrieves a value, something like:
update top (1) CatalogNumber
set in_use = 1 ,
@newCatalogNumber = catalog_number
from CatalogNumber
where in_use = 0
Anyway, the following stored procedure just ticks up a number on each execution and hands back the previous one. If you want a fancier value, add a computed column that applies the transform of choice to the incrementing value to get the desired value.
drop table dbo.PrimaryKeyGenerator
go
create table dbo.PrimaryKeyGenerator
(
id varchar(100) not null ,
current_value int not null default(1) ,
constraint PrimaryKeyGenerator_PK primary key clustered ( id ) ,
)
go
drop procedure dbo.GetNewPrimaryKey
go
create procedure dbo.GetNewPrimaryKey
@name varchar(100)
as
set nocount on
set ansi_nulls on
set concat_null_yields_null on
set xact_abort on
declare
@uniqueValue int
--
-- put the supplied key in canonical form
--
set @name = ltrim(rtrim(lower(@name)))
--
-- if the name isn't already defined in the table, define it.
--
insert dbo.PrimaryKeyGenerator ( id )
select id = @name
where not exists ( select *
from dbo.PrimaryKeyGenerator pkg
where pkg.id = @name
)
--
-- now, an interlocked update to get the current value and increment the table
--
update PrimaryKeyGenerator
set @uniqueValue = current_value ,
current_value = current_value + 1
where id = @name
--
-- return the new unique value to the caller
--
return @uniqueValue
go
To use it:
declare @pk int
exec @pk = dbo.GetNewPrimaryKey 'foobar'
select @pk
Trivial to mod it to return a result set or return the value via an OUTPUT parameter.

Check constraint with only one value SQL Server 2005

I have a table with these fields:
User_id, User_type, User_address
Is it possible to add a constraint where only one record can exist where user_type = 'xyz' per user_id? There can be as many user_type = 'abc' as we wish but only one 'xyz'.
I know that this is not the greatest design but it is what is there currently and I need to lock it down a bit.
Thanks

You'll need to use a trigger...
CREATE TRIGGER yourTriggerName ON YourTableName
AFTER INSERT,UPDATE
AS
IF EXISTS (SELECT
y.User_id --,COUNT(y.User_Type)
FROM YourTableName y
INNER JOIN inserted i ON y.User_id=i.User_id
WHERE y.User_Type='xyz'
GROUP BY y.User_id
HAVING COUNT(y.User_Type)>1
)
BEGIN
ROLLBACK
END
go
Also, make sure there is an index on User_id+User_type.

A very common question. My canned answer:
Use Computed Columns to Implement Complex Business Rules
You can also use an indexed view to accomplish the same. Note that wrapping a UDF in a check constraint may not work if you modify more than one row at a time or if you use snapshot isolation:
Scalar UDFs wrapped in CHECK constraints are very slow and may fail for multirow updates
Why am I recommending an index, not a trigger?
Because if I have an index I am 100% sure all my data is clean. With triggers, it is not the case. Sometimes triggers do not fire, sometimes they have bugs. Another trigger can override this one.
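One way the indexed view idea might look for this question, which also works on SQL Server 2005 where filtered indexes are not available. The view and index names are invented for illustration, and dbo.YourTableName stands in for the real table.

```sql
-- The view exposes only the 'xyz' rows. SCHEMABINDING is required
-- before a view can be indexed.
CREATE VIEW dbo.UsersOfTypeXyz
WITH SCHEMABINDING
AS
SELECT User_id
FROM dbo.YourTableName
WHERE User_type = 'xyz';
GO

-- A unique index on the view rejects a second 'xyz' row for the same
-- User_id, while rows of any other type never appear in the view.
CREATE UNIQUE CLUSTERED INDEX UX_UsersOfTypeXyz_User_id
ON dbo.UsersOfTypeXyz (User_id);
```

Creating an index on a view requires certain session settings such as ANSI_NULLS and QUOTED_IDENTIFIER to be ON, so the statements may fail on connections with non-default options.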

I had the same idea as Daniel, but I think your constraint as you put it needs to check for at most 1 XYZ type PER USER:
CREATE FUNCTION dbo.CheckUserTypeXyzExistAtMostOnce(@User_id int)
RETURNS bit
AS
BEGIN
DECLARE @count int
SELECT @count = COUNT(*) FROM dbo.MyTable WHERE User_id = @User_id AND User_type = 'xyz'
-- T-SQL cannot return a comparison directly, so map it to a bit
RETURN CASE WHEN @count <= 1 THEN 1 ELSE 0 END
END;
GO
ALTER TABLE dbo.MyTable ADD CONSTRAINT UserTypeConstraint CHECK (dbo.CheckUserTypeXyzExistAtMostOnce(User_id) = 1);

I'm not sure it's the best way, but you could always create an insert/update trigger.

You can use a check constraint to enforce this rule.
CREATE FUNCTION dbo.CheckUserTypeXyzExistAtMostOnce()
RETURNS bit
AS
BEGIN
DECLARE @count int
SELECT @count = COUNT(*) FROM dbo.MyTable WHERE User_type = 'xyz'
-- T-SQL cannot return a comparison directly, so map it to a bit
RETURN CASE WHEN @count <= 1 THEN 1 ELSE 0 END
END;
GO
ALTER TABLE dbo.MyTable
ADD CONSTRAINT UserTypeConstraint CHECK (dbo.CheckUserTypeXyzExistAtMostOnce() = 1);
