Prevent circular reference in MS-SQL table - sql

I have an Account table with ID and ParentAccountID. Here is a script to reproduce the problem.
If ParentAccountID is NULL, the account is considered a top-level account.
Every account's chain should eventually end at a top-level account, i.e. one whose ParentAccountID is NULL.
Declare @Accounts table (ID INT, ParentAccountID INT )
INSERT INTO @Accounts values (1,NULL), (2,1), (3,2) ,(4,3), (5,4), (6,5)
select * from @Accounts
-- Request to update ParentAccountID to 6 for the ID 3
update @Accounts
set ParentAccountID = 6
where ID = 3
-- Now the above update will cause circular reference
select * from @Accounts
When a request comes in to update the ParentAccountID of an account, I need to detect, before the update is applied, whether it would cause a circular reference.
Any ideas, folks?

It seems you've got some business rules defined for your table:
Every chain must end with a top-level account
A chain may not have a circular reference
You have two ways to enforce this.
You can create a trigger in your database, and check the logic in the trigger. This has the benefit of running inside the database, so it applies to every transaction, regardless of the client. However, database triggers are not always popular. I see them as a side effect, and they can be hard to debug. Triggers run as part of your SQL, so if they are slow, your SQL will be slow.
The alternative is to enforce this logic in the application layer - whatever is talking to your database. This is easier to debug, and makes your business logic explicit to new developers - but it doesn't run inside the database, so you could end up replicating the logic if you have multiple client applications.

Here is an example that you could use as a basis to implement a database constraint that should prevent circular references in single-row updates; I don't believe this will work to prevent a circular reference if multiple rows are updated in one statement.
/*
ALTER TABLE dbo.Test DROP CONSTRAINT chkTest_PreventCircularRef
GO
DROP FUNCTION dbo.Test_PreventCircularRef
GO
DROP TABLE dbo.Test
GO
*/
CREATE TABLE dbo.Test (TestID INT PRIMARY KEY,TestID_Parent INT)
INSERT INTO dbo.Test(TestID,TestID_Parent) SELECT 1 AS TestID,NULL AS TestID_Parent
INSERT INTO dbo.Test(TestID,TestID_Parent) SELECT 2 AS TestID,1 AS TestID_Parent
INSERT INTO dbo.Test(TestID,TestID_Parent) SELECT 3 AS TestID,2 AS TestID_Parent
INSERT INTO dbo.Test(TestID,TestID_Parent) SELECT 4 AS TestID,3 AS TestID_Parent
INSERT INTO dbo.Test(TestID,TestID_Parent) SELECT 5 AS TestID,4 AS TestID_Parent
GO
GO
CREATE FUNCTION dbo.Test_PreventCircularRef (@TestID INT,@TestID_Parent INT)
RETURNS INT
BEGIN
--FOR TESTING:
--SELECT * FROM dbo.Test;DECLARE @TestID INT=3,@TestID_Parent INT=4
DECLARE @ParentID INT=@TestID
DECLARE @ChildID INT=NULL
DECLARE @RetVal INT=0
DECLARE @Ancestors TABLE(TestID INT)
DECLARE @Descendants TABLE(TestID INT)
--Get all descendants
INSERT INTO @Descendants(TestID) SELECT TestID FROM dbo.Test WHERE TestID_Parent=@TestID
WHILE (@@ROWCOUNT>0)
BEGIN
INSERT INTO @Descendants(TestID)
SELECT t1.TestID
FROM dbo.Test t1
LEFT JOIN @Descendants relID ON relID.TestID=t1.TestID
WHERE relID.TestID IS NULL
AND t1.TestID_Parent IN (SELECT TestID FROM @Descendants)
END
--Get all ancestors
--INSERT INTO @Ancestors(TestID) SELECT TestID_Parent FROM dbo.Test WHERE TestID=@TestID
--WHILE (@@ROWCOUNT>0)
--BEGIN
-- INSERT INTO @Ancestors(TestID)
-- SELECT t1.TestID_Parent
-- FROM dbo.Test t1
-- LEFT JOIN @Ancestors relID ON relID.TestID=t1.TestID_Parent
-- WHERE relID.TestID IS NULL
-- AND t1.TestID_Parent IS NOT NULL
-- AND t1.TestID IN (SELECT TestID FROM @Ancestors)
--END
--FOR TESTING:
--SELECT TestID AS [Ancestors] FROM @Ancestors;SELECT TestID AS [Descendants] FROM @Descendants;
IF EXISTS (
SELECT *
FROM @Descendants
WHERE TestID=@TestID_Parent
)
BEGIN
SET @RetVal=1
END
RETURN @RetVal
END
END
GO
ALTER TABLE dbo.Test
ADD CONSTRAINT chkTest_PreventCircularRef
CHECK (dbo.Test_PreventCircularRef(TestID,TestID_Parent) = 0);
GO
SELECT * FROM dbo.Test
--This is problematic as it creates a circular reference between TestID 3 and 4; it is now prevented
UPDATE dbo.Test SET TestID_Parent=4 WHERE TestID=3

Dealing with self-referencing tables / recursive relationships in SQL is not simple. I suppose this is evidenced by the fact that multiple people can't get their heads around the problem with just checking for single-depth cycles.
To enforce this with table constraints, you would need a check constraint based on a recursive query. At best that's DBMS-specific support, and it may not perform well if it has to be run on every update.
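As a rough illustration of what such a recursive query looks like, here is a minimal sketch in Python against an in-memory SQLite database, used as a stand-in for SQL Server. T-SQL omits the RECURSIVE keyword and requires UNION ALL in recursive CTEs, so the syntax would need adapting. The helper names make_demo_db, ancestors and would_create_cycle are made up for this example; the data mirrors the Accounts table from the question.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all helper names are illustrative.

def make_demo_db():
    """Build the six-account chain from the question: 6 -> 5 -> 4 -> 3 -> 2 -> 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Accounts (ID INTEGER PRIMARY KEY, ParentAccountID INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Accounts VALUES (?, ?)",
        [(1, None), (2, 1), (3, 2), (4, 3), (5, 4), (6, 5)],
    )
    return conn

def ancestors(conn, account_id):
    """Return the set of IDs reached by following ParentAccountID upward."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id) AS (
            SELECT ParentAccountID FROM Accounts WHERE ID = ?
            UNION  -- UNION (not UNION ALL) drops repeats, so a loop cannot run forever
            SELECT a.ParentAccountID
            FROM Accounts AS a
            JOIN chain AS c ON a.ID = c.id
            WHERE a.ParentAccountID IS NOT NULL
        )
        SELECT id FROM chain WHERE id IS NOT NULL
        """,
        (account_id,),
    ).fetchall()
    return {row[0] for row in rows}

def would_create_cycle(conn, account_id, new_parent_id):
    """True if making new_parent_id the parent of account_id would close a loop."""
    if new_parent_id is None:
        return False  # becoming a top-level account can never create a cycle
    return new_parent_id == account_id or account_id in ancestors(conn, new_parent_id)
```

With the question's data, would_create_cycle(conn, 3, 6) reports a cycle because 3 is already an ancestor of 6, while would_create_cycle(conn, 6, 3) does not.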
My advice is to have the code containing the UPDATE statement enforce this. That could take a couple of forms. In any case if it needs to be strictly enforced it may require limiting UPDATE access into the table to a service account used by a stored proc or external service.
Using a stored procedure would be very similar to a CHECK constraint, except that you could use procedural (iterative) logic to look for cycles before doing the update. It has become unpopular to put too much logic in stored procs, though, and whether this type of check should be done there is a judgement call that varies from team to team and organization to organization.
Likewise using a service-based approach would let you use procedural logic to look for cycles, and you could write it in a language better suited to such logic. The issue here is, if services aren't part of your architecture then it's a bit heavy-weight to introduce a whole new layer. But, a service layer is probably considered more modern/popular (at the moment at least) than funneling updates through stored procs.
With those approaches in mind - and understanding that both procedural and recursive syntax in databases is DBMS-specific - there are too many possible syntax options to really go into. But the idea is:
Examine the proposed parent.
Check its parent recursively.
Do you ever reach the proposed child before reaching a top-level account? If not, allow the update.
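The steps above can be sketched in a few lines of application code. This is a minimal Python sketch, assuming the hierarchy has already been loaded into a dict that maps each account ID to its parent ID (None for a top-level account); the function name parent_update_is_safe is hypothetical.

```python
def parent_update_is_safe(parents, child_id, proposed_parent_id):
    """Walk up from the proposed parent and report whether the update is allowed.

    parents: dict mapping account ID -> parent ID (None for a top-level account).
    Assumption: an ID missing from the dict is treated like a top-level account.
    """
    seen = set()
    current = proposed_parent_id
    while current is not None:
        if current == child_id:
            return False  # reached the child before a top-level account: a cycle
        if current in seen:
            return False  # the existing data already loops and never reaches the top
        seen.add(current)
        current = parents.get(current)
    return True  # reached a top-level account without meeting the child
```

For the question's data, `{1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5}`, setting the parent of 3 to 6 is rejected, while setting the parent of 6 to 3 is allowed.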

Finally, after a few failed attempts, I wrote the following script, and it is working fine for me.
-- To hold the Account table data
Declare @Accounts table (ID INT, ParentAccountID INT)
-- To be updated
Declare @AccountID int = 4;
Declare @ParentAccountID int = 7;
Declare @NextParentAccountID INT = @ParentAccountID
Declare @IsCircular int = 0
INSERT INTO @Accounts values (1, NULL), (2,1), (3,1) ,(4,3), (5,4), (6,5), (7,6), (8,7)
-- No circular reference value
--Select * from @Accounts
-- Request to update ParentAccountID to 7 for the Account ID 4
update @Accounts
set ParentAccountID = @ParentAccountID
where ID = @AccountID
Select * from @Accounts
WHILE(1=1)
BEGIN
-- Take the ParentAccountID for @NextParentAccountID
SELECT @NextParentAccountID = ParentAccountID from @Accounts WHERE ID = @NextParentAccountID
-- If the @NextParentAccountID is NULL, then it reaches the top level account, no circular reference hence break the loop
IF (@NextParentAccountID IS NULL)
BEGIN
BREAK;
END
-- If the @NextParentAccountID is equal to @AccountID (to which the update was done) then its creating circular reference
-- Then set the @IsCircular to 1 and break the loop
IF (@NextParentAccountID = @AccountID )
BEGIN
SET @IsCircular = 1
BREAK
END
END
IF @IsCircular = 1
BEGIN
select 'CircularReference' as 'ResponseCode'
END

Related

Constrain Sum(Column) to 1 by some group ID

I have a table that I'm trying to make sure that an aggregate sum of the inserts adds up to 1 (it's a mixture).
I want to constrain it so the whole FKID =2 fails because it adds up to 1.1.
Currently my constraint is
CREATE FUNCTION [dbo].[CheckSumTarget](@ID bigint)
RETURNS bit
AS BEGIN
DECLARE @Res BIT
SELECT @Res = Count(1)
FROM dbo.Test AS t
WHERE t.FKID = @ID
GROUP BY t.FKID
HAVING Sum(t.[Value])<>1
RETURN @Res
END
GO
ALTER TABLE dbo.Test WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK (([dbo].[CheckSumTarget]([FKID])<>(1)))
but it's failing on the first insert because it doesn't add up to 1 yet. I was hoping if I add them all simultaneously, that wouldn't be the case.
This approach seems fraught with problems.
I would suggest another approach, starting with two tables:
aggregates, so "fkid" should really be aggregate_id
components
Then, in aggregates accumulate the sum() of the component values using a trigger. Maintain another flag that is computed:
alter table aggregates add is_valid as ( case when sum_value = 1.0 then 1 else 0 end )
Then, create views on the two tables to only show records where is_valid = 1. For instance:
create view v_aggregates as
select c.*
from aggregates a join
components c
on a.aggregate_id = c.aggregate_id
where a.is_valid = 1;
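To make the moving parts of this suggestion concrete, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for SQL Server. The trigger names, the view name v_components and the helper functions are assumptions made for the example. It only maintains the total on insert and delete; an UPDATE trigger would be needed as well. The test values are chosen to be exactly representable in binary floating point; a real mixture table would more likely use a decimal type or a tolerance.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; object names are illustrative.

def make_mixture_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE aggregates (
            aggregate_id INTEGER PRIMARY KEY,
            sum_value    REAL NOT NULL DEFAULT 0
        );
        CREATE TABLE components (
            component_id INTEGER PRIMARY KEY,
            aggregate_id INTEGER NOT NULL REFERENCES aggregates(aggregate_id),
            value        REAL NOT NULL
        );
        -- Keep the running total on the parent row in step with its components.
        CREATE TRIGGER components_ai AFTER INSERT ON components
        BEGIN
            UPDATE aggregates SET sum_value = sum_value + NEW.value
            WHERE aggregate_id = NEW.aggregate_id;
        END;
        CREATE TRIGGER components_ad AFTER DELETE ON components
        BEGIN
            UPDATE aggregates SET sum_value = sum_value - OLD.value
            WHERE aggregate_id = OLD.aggregate_id;
        END;
        -- Only components of complete mixtures (total exactly 1) are visible.
        CREATE VIEW v_components AS
            SELECT c.*
            FROM aggregates AS a
            JOIN components AS c ON a.aggregate_id = c.aggregate_id
            WHERE a.sum_value = 1.0;
        """
    )
    return conn

def add_components(conn, aggregate_id, values):
    """Insert the parent row if needed, then one component row per value."""
    conn.execute(
        "INSERT OR IGNORE INTO aggregates (aggregate_id) VALUES (?)", (aggregate_id,)
    )
    conn.executemany(
        "INSERT INTO components (aggregate_id, value) VALUES (?, ?)",
        [(aggregate_id, v) for v in values],
    )

def valid_aggregates(conn):
    """IDs of aggregates whose components are visible through the view."""
    rows = conn.execute(
        "SELECT DISTINCT aggregate_id FROM v_components ORDER BY aggregate_id"
    )
    return [row[0] for row in rows]
```

The point of the design is that incomplete mixtures are allowed to exist in the base tables while they are being built; they simply do not show up through the view until their total reaches 1.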
Here is a working version of the solution.
Here is the table DDL
create table dbo.test(
id int,
fkid bigint,
value decimal(4,2)
);
The function definition
CREATE FUNCTION [dbo].[CheckSumTarget](@ID bigint)
RETURNS bit
AS BEGIN
DECLARE @Res bit
SELECT @Res = case when sum(value) > 1 then 1 else 0 end
FROM dbo.Test AS t
WHERE t.FKID = @ID
RETURN @Res
END
And the constraint definition
ALTER TABLE dbo.Test WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK ([dbo].[CheckSumTarget]([FKID]) <> 1)
In your example
insert into dbo.test values (1, 2, 0.5);
insert into dbo.test values (1, 2, 0.4);
-- The following insert will fail, like you expect
insert into dbo.test values (1, 2, 0.2);
Note: This solution will be broken by an UPDATE statement (as pointed out by 'Daniel Brughera'); that is a known behaviour. A better and more common approach is to use a trigger. You may want to explore that.
Your current approach will work this way:
You insert the first component; its value must be 1.
You try to insert a second component; it is rejected because your sum is already 1.
You update the existing component to .85.
You insert the next component; its value must be .15.
You go back to step 2 with the third component.
Since your constraint only looks at the FKID column, this is possible, and you may think that it is working.
But if you stop the process at step 3, your sum is not equal to 1, and the constraint cannot foresee whether you will insert the next value or not. Even worse, you can update any value to be greater than 1 and it will be accepted.
If you add the value column to your constraint, it will prevent those updates, but you will never be able to get beyond step 1.
Personally I wouldn't do that, but here is one possible approach:
Use the computed column suggested by Gordon on your parent table. A computed column always reflects the actual value, so the parent won't be valid until the sum is equal to one.
Use the solution below to prevent the sum from exceeding 1, so at least you can be sure that any invalid parent is invalid because a component is missing; that can be helpful for your business layer.
As I mentioned in a comment, the rest of the logic belongs in the business and UI layers.
Note that the id and value parameters are not used in the function, but I need to pass them when I create the constraint; that way the constraint will validate updates too.
CREATE TABLE ttest (id int, fkid int, value float)
go
create FUNCTION [dbo].[CheckSumTarget](@id int, @fkid int, @value float)
RETURNS FLOAT
AS BEGIN
DECLARE @Res float
SELECT @Res = sum(value)
FROM dbo.ttest AS t
WHERE t.FKID = @fkid
RETURN @Res
END
GO
ALTER TABLE dbo.ttest WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK (([dbo].[CheckSumTarget](id,[FKID],value)<=(1.0)))

SQL - Unique key across 2 columns of same table?

I use SQL Server 2016. I have a database table called "Member".
In that table, I have these 3 columns (for the purpose of my question):
idMember [INT - Identity - Primary Key]
memEmail
memEmailPartner
I want to prevent a row from using an email that already exists in the table.
Both email columns are not mandatory, so they can be left blank (NULL).
If I create a new Member:
If not blank, the values entered for "memEmail" and "memEmailPartner" (independently) should not be found in any other rows in columns memEmail nor memEmailPartner.
So if I want to create a row with the email dominic@email.com, I must not find any occurrences of that value in memEmail or memEmailPartner.
If I update an existing Member:
I must not find any occurrences of that value in memEmail or memEmailPartner, except in the row I am updating (idMember), which may already have the value in memEmail or memEmailPartner.
--
From what I read on Google, it should be possible to do something with a Function-Based Check Constraint but I can't make that work.
Anyone have a solution to my problem ?
Thank you.
I may have misunderstood exactly what you were asking but it looks like you want a simple upsert query with IF EXISTS conditions.
DECLARE @emailAddress VARCHAR(255)= 'dominic@email.com', --dummy value
@id INT= 2; --dummy value
IF NOT EXISTS
(
SELECT 1
FROM #Member
WHERE memEmail = @emailAddress
OR memEmailPartner = @emailAddress
)
BEGIN
SELECT 'insert';
END;
ELSE IF EXISTS
(
SELECT 1
FROM #Member
WHERE idMember = @id
)
BEGIN
SELECT 'update';
END;
A trigger is the traditional way of doing what you're asking for. Here's a simple demo:
--if object_id('member') is not null drop table member
go
create table member (
idMember INT Identity Primary Key,
memEmail varchar(100),
memEmailPartner varchar(100)
)
go
create trigger trg_member on member after insert, update as
begin
set nocount on
if exists (select 1 from member m join inserted i on i.memEmail = m.memEmail and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmail = m.memEmailPartner and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmailPartner = m.memEmail and i.idMember <> m.idMember) or
exists (select 1 from member m join inserted i on i.memEmailPartner = m.memEmailPartner and i.idMember <> m.idMember)
begin
raiserror('Email addresses must be unique.', 16, 1)
rollback
end
end
go
insert member(memEmail, memEmailPartner) values('a@a.com', null), ('b@b.com', null), (null, 'c@c.com'), (null, 'd@d.com')
go
select * from member
insert member(memEmail, memEmailPartner) values('a@a.com', null) -- should fail
go
insert member(memEmail, memEmailPartner) values(null, 'a@a.com') -- should fail
go
insert member(memEmail, memEmailPartner) values('c@c.com', null) -- should fail
go
insert member(memEmail, memEmailPartner) values(null, 'c@c.com') -- should fail
go
insert member(memEmail, memEmailPartner) values('e@e.com', null) -- should work
go
insert member(memEmail, memEmailPartner) values(null, 'f@f.com') -- should work
go
select * from member
-- Make sure updates still work!
update member set memEmail = memEmail, memEmailPartner = memEmailPartner
I've not tested this extensively but it should be enough to get you started if you want to try this approach.
StuartLC notes the potential for the UDF check constraint to fail in set-based updates and/or various other conditions; triggers don't have this problem.
Stuart also suggests reconsidering whether this should really be a database constraint or managed through business logic elsewhere. I'm inclined to agree - my gut feel here is that sooner or later you will come across a situation that requires email addresses to be reused, or in some other way not strictly unique.
TL;DR
The wisdom of applying this kind of business rule logic in the database needs to be reconsidered - this check is likely a better candidate for your application, or a stored procedure which acts as an insert gate keeper instead of direct new row inserts into the table.
Ignoring the Warnings
That said, I do believe that what you want is possible in a constraint UDF, albeit with potentially atrocious performance consequences*1, and it is likely prone to race conditions in set-based updates.
Here's a user-defined function which applies the unique email logic across both columns. Note that by the time the constraint is checked, the row is already IN the table, hence the new row itself needs to be excluded from the duplicate checks.
My code is also dependent on ANSI NULL behaviour, i.e. that the predicates NULL = NULL and X IN (NULL) both return NULL, and hence are excluded from the failure check (in order to meet your requirement that NULLs do not fail the rule).
We also need to check for the insert of BOTH new columns being non-null, but duplicated.
So here's a UDF doing the checking:
CREATE FUNCTION dbo.CheckUniqueEmails(@id int, @memEmail varchar(50),
@memEmailPartner varchar(50))
RETURNS bit
AS
BEGIN
DECLARE @retval bit;
IF @memEmail = @memEmailPartner
OR EXISTS (SELECT 1 FROM MyTable WHERE memEmail IS NOT NULL
AND memEmail IN(@memEmail, @memEmailPartner) AND idMember <> @id)
OR EXISTS (SELECT 1 FROM MyTable WHERE memEmailPartner IS NOT NULL
AND memEmailPartner IN(@memEmail, @memEmailPartner) AND idMember <> @id)
SET @retval = 0
ELSE
SET @retval = 1;
RETURN @retval;
END;
GO
Which is then enforced in a CHECK constraint:
ALTER TABLE MyTable ADD CHECK (dbo.CheckUniqueEmails(
idMember, memEmail, memEmailPartner) = 1);
I've put a SQLFiddle up here
Uncomment the 'failed' test cases to ensure that the above check constraint is working.
I haven't tested this with updates, and as per Martin's advice on the link, this will likely break on an insert with multiple rows.
*1 - we'll need indexes on BOTH email address columns.
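The NULL behaviour this answer relies on is easy to check in isolation. The sketch below uses Python with an in-memory SQLite database rather than SQL Server, so treat it as an illustration of standard three-valued logic, not as a test of the UDF itself; SQL Server is expected to behave the same way for these predicates when ANSI_NULLS is ON. The helper names evaluate and rows_matching are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; helper names are illustrative.

def evaluate(expr):
    """Evaluate a scalar SQL expression; Python None represents SQL NULL."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(f"SELECT {expr}").fetchone()[0]
    finally:
        conn.close()

def rows_matching(predicate):
    """Count how many of one candidate row survive a WHERE clause (0 or 1)."""
    conn = sqlite3.connect(":memory:")
    try:
        sql = f"SELECT COUNT(*) FROM (SELECT 1 AS one) WHERE {predicate}"
        return conn.execute(sql).fetchone()[0]
    finally:
        conn.close()
```

Here evaluate("NULL = NULL") and evaluate("'x' IN (NULL)") both come back as None (SQL NULL), and rows_matching("NULL = NULL") is 0: a predicate that evaluates to NULL is not true, so rows with NULL emails never count as duplicates.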

Reuse results of SELECT query inside a stored procedure

This is probably a very simple question, but my attempts to search for an answer are thwarted by Google finding answers showing how to reuse a query by making a stored procedure instead. I want to reuse the results of a query inside a stored procedure.
Here's a cut-down example where I've chopped out NOCOUNT, XACT_ABORT, TRANSACTION, TRY, and much of the logic.
CREATE PROCEDURE Do_Something
@userId UNIQUEIDENTIFIER
AS
BEGIN
DELETE FROM LikedItems
WHERE likedItemId IN
(
SELECT Items.id FROM Items
WHERE Items.userId = @userId
)
DELETE FROM FollowedItems
WHERE followedItemId IN
(
SELECT Items.id FROM Items
WHERE Items.userId = @userId
)
END
What is the syntax to reuse the results of the duplicated nested SELECT rather than doing it twice?
You can INSERT result of the SELECT into a temporary table or table variable, but it doesn't automatically mean that the overall performance would be better. You need to measure it.
Temp Table
CREATE PROCEDURE Do_Something
@userId UNIQUEIDENTIFIER
AS
BEGIN
CREATE TABLE #Temp(id int);
INSERT INTO #Temp(id)
SELECT Items.id
FROM Items
WHERE Items.userId = @userId;
DELETE FROM LikedItems
WHERE likedItemId IN
(
SELECT id FROM #Temp
)
DELETE FROM FollowedItems
WHERE followedItemId IN
(
SELECT id FROM #Temp
)
DROP TABLE #Temp;
END
Table variable
CREATE PROCEDURE Do_Something
@userId UNIQUEIDENTIFIER
AS
BEGIN
DECLARE @Temp TABLE(id int);
INSERT INTO @Temp(id)
SELECT Items.id
FROM Items
WHERE Items.userId = @userId;
DELETE FROM LikedItems
WHERE likedItemId IN
(
SELECT id FROM @Temp
)
DELETE FROM FollowedItems
WHERE followedItemId IN
(
SELECT id FROM @Temp
)
END
You can declare a table variable to store the results of the select and then simply query that.
CREATE PROCEDURE Do_Something
@userId UNIQUEIDENTIFIER
AS
BEGIN
DECLARE @TempItems TABLE (id int)
INSERT INTO @TempItems
SELECT Items.id FROM Items
WHERE Items.userId = @userId
DELETE FROM LikedItems
WHERE likedItemId IN
(
SELECT id FROM @TempItems
)
DELETE FROM FollowedItems
WHERE followedItemId IN
(
SELECT id FROM @TempItems
)
END
If the subquery is fast and simple, there is no need to change anything. After the first query the Items data is in the cache (if it was not already) and the locks have been obtained. If the subquery is slow and complicated, store its result in a table variable and reuse it through the same kind of subquery as listed in the question.
If your question is not related to performance and you are wary of copy-paste: there is no copy-paste here. It is the same logic with a similar structure and references, so yes, you will have almost the same query source code.
In general, the two subqueries are not the same. Some rows could be deleted from or inserted into the Items table after the first query unless you are running under the SERIALIZABLE isolation level. Many different things could happen during the first delete and between the first and second delete statements. Each delete statement also requires its own execution plan, so all the information about the tables affected and the joins must be provided to the server anyway. You need to filter by the same source again, so you provide a subquery with the same source again. There is no "twice" or "reuse" of partial code. Data collected by a complicated query, however, can be reused (by simply querying a prepared source instead of re-running the complicated query) via temp tables or table variables, as mentioned before.
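The "collect once, reuse twice" pattern from the answers above can be tried out end to end with a small sketch. It uses Python with an in-memory SQLite database instead of SQL Server, a temp table named temp_item_ids, and text user IDs in place of UNIQUEIDENTIFIER; those names and the sample data are assumptions for illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; names and data are illustrative.

def make_items_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Items (id INTEGER PRIMARY KEY, userId TEXT);
        CREATE TABLE LikedItems (likedItemId INTEGER);
        CREATE TABLE FollowedItems (followedItemId INTEGER);
        INSERT INTO Items VALUES (1, 'alice'), (2, 'alice'), (3, 'bob');
        INSERT INTO LikedItems VALUES (1), (3), (3);
        INSERT INTO FollowedItems VALUES (1), (2), (3);
        """
    )
    return conn

def delete_user_references(conn, user_id):
    """Collect the user's item IDs once, then reuse them for both deletes.

    Returns (liked_rows_deleted, followed_rows_deleted).
    """
    with conn:  # commit on success, roll back if any statement fails
        conn.execute(
            "CREATE TEMP TABLE IF NOT EXISTS temp_item_ids (id INTEGER PRIMARY KEY)"
        )
        conn.execute("DELETE FROM temp_item_ids")  # clear leftovers from earlier calls
        conn.execute(
            "INSERT INTO temp_item_ids (id) SELECT id FROM Items WHERE userId = ?",
            (user_id,),
        )
        liked = conn.execute(
            "DELETE FROM LikedItems WHERE likedItemId IN (SELECT id FROM temp_item_ids)"
        ).rowcount
        followed = conn.execute(
            "DELETE FROM FollowedItems "
            "WHERE followedItemId IN (SELECT id FROM temp_item_ids)"
        ).rowcount
    return liked, followed
```

Because both deletes read the same snapshot of IDs, they stay consistent with each other even if Items changes in between, which is the behavioural difference from repeating the subquery that the last answer points out.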

sql server cannot access inserted table in a trigger

I am trying to create a simple insert trigger that gets the count from one table and adds it to another, like this:
CREATE TABLE [poll-count](
id VARCHAR(100),
altid BIGINT,
option_order BIGINT,
uip VARCHAR(50),
[uid] VARCHAR(100),
[order] BIGINT,
PRIMARY KEY NONCLUSTERED([order]),
FOREIGN KEY ([order]) references ord ([order])
)
GO
CREATE TRIGGER [get-poll-count]
ON [poll-count]
FOR INSERT
AS
BEGIN
DECLARE @count INT
SET @count = (SELECT COUNT (*) FROM [poll-count] WHERE option_order = i.option_order)
UPDATE [poll-options] SET [total] = @count WHERE [order] = i.option_order
END
GO
Whenever I try to run this I get this error:
The multi-part identifier "i.option_order" could not be bound
what is the problem?
thanks
Your trigger currently assumes that there will always be one-row inserts. Have you tried your trigger with anything like this?
INSERT dbo.[poll-options](option_order --, ...)
VALUES(1 --, ...),
(2 --, ...);
Also, you say that SQL Server "cannot access inserted table" - yet your statement says this. Where do you reference inserted (even if this were a valid subquery structure)?
SET @count = (SELECT COUNT (*) FROM [poll-count]
  WHERE option_order = i.option_order)
-----------------------^ "i" <> "inserted"
Here is a trigger that properly references inserted and also properly handles multi-row inserts:
CREATE TRIGGER dbo.pollupdate
ON dbo.[poll-options]
FOR INSERT
AS
BEGIN
SET NOCOUNT ON;
;WITH x AS
(
SELECT option_order, c = COUNT(*)
FROM dbo.[poll-options] AS p
WHERE EXISTS
(
SELECT 1 FROM inserted
WHERE option_order = p.option_order
)
GROUP BY option_order
)
UPDATE p SET total = x.c
FROM dbo.[poll-options] AS p
INNER JOIN x
ON p.option_order = x.option_order;
END
GO
However, why do you want to store this data on every row? You can always derive the count at runtime, know that it is perfectly up to date, and avoid the need for a trigger altogether. If it's about the performance aspect of deriving the count at runtime, a much easier way to implement this write-ahead optimization for about the same maintenance cost during DML is to create an indexed view:
CREATE VIEW dbo.[poll-options-count]
WITH SCHEMABINDING
AS
SELECT option_order, c = COUNT_BIG(*)
FROM dbo.[poll-options]
GROUP BY option_order;
GO
CREATE UNIQUE CLUSTERED INDEX oo ON dbo.[poll-options-count](option_order);
GO
Now the index is maintained for you and you can derive very quick counts for any given (or all) option_order values. You'll have to test, of course, whether the improvement in query time is worth the increased maintenance (though you are already paying that price with the trigger, except that it can affect many more rows in any given insert, so...).
As a final suggestion, don't use special characters like - in object names. It just forces you to always wrap it in [square brackets] and that's no fun for anyone.

Does anyone know a neat trick for reusing identity values?

Typically when you specify an identity column you get a convenient interface in SQL Server for asking for particular row.
SELECT * FROM $IDENTITY = @pID
You don't really need to concern yourself with the name of the identity column because there can only be one.
But what if I have a table which mostly consists of temporary data, with lots of inserts and lots of deletes? Is there a simple way for me to reuse the identity values?
Preferably I would want to be able to write a function that would return say NEXT_SMALLEST($IDENTITY) as next identity value and do so in a fail-safe manner.
Basically find the smallest value that's not in use. That's not entirely trivial to do, but what I want is to be able to tell SQL Server that this is my function that will generate the identity values. But what I know is that no such function exists...
I want to...
Implement global data base IDs, I need to provide a default value that I'm in control of.
My idea was based around that I should be able to have a table with all known IDs and then every row ID from some other table that needed a global ID would reference that table. The default value would be provided by something like
INSERT INTO GlobalID
RETURN SCOPE_IDENTITY()
No; it's not unique if it can be reused.
Why do you want to re-use them? Why do you concern yourself with this field? If you want to be in control of it, don't make it an identity; create your own scheme and use that.
Don't reuse identities; you'll just shoot yourself in the foot. Use a large enough type so that it never rolls over (a 64-bit bigint).
To find missing gaps in a sequence of numbers join the table against itself with a +/- 1 difference:
SELECT a.id
FROM table AS a
LEFT OUTER JOIN table AS b ON a.id = b.id+1
WHERE b.id IS NULL;
This query will find the numbers in the id sequence for which id-1 is not in the table, i.e. the starting numbers of contiguous sequences. You can then use SET IDENTITY_INSERT ... ON to insert a specific id and reuse a number. The cost of doing so is overwhelming (both in runtime and in code complexity) compared with an ordinary identity-based insert.
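To see what the self-join returns on concrete data, here is a small sketch in Python using an in-memory SQLite database as a stand-in for SQL Server. The table is named t because table is a reserved word, and the helper name sequence_starts is made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; names are illustrative.

def sequence_starts(ids):
    """Return the ids whose predecessor (id - 1) is absent, in ascending order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in ids])
        rows = conn.execute(
            """
            SELECT a.id
            FROM t AS a
            LEFT OUTER JOIN t AS b ON a.id = b.id + 1
            WHERE b.id IS NULL
            ORDER BY a.id
            """
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

For the ids 1, 2, 3, 4, 7, 8, 11 this yields 1, 7 and 11: each is the first id of a contiguous run, so the value just below it (where positive) is unused. Mirroring the join condition to `b.id = a.id + 1` would instead return the last id of each run.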
If you really want to reset Identity value to the lowest,
here is the trick you can use through DBCC CHECKIDENT
Basically following sql statements resets identity value so that identity value restarts from the lowest possible number
create table TT (id int identity(1, 1))
GO
insert TT default values
GO 10
select * from TT
GO
delete TT where id between 5 and 10
GO
--; At this point, next ID will be 11, not 5
select * from TT
GO
insert TT default values
GO
--; as you can see here, next ID is indeed 11
select * from TT
GO
--; Now delete ID = 11
--; so that we can reseed next highest ID to 5
delete TT where id = 11
GO
--; Now, let's reseed identity value to the lowest possible identity number
declare @seedID int
select @seedID = max(id) from TT
print @seedID --; 4
--; We reseed identity column with "DBCC CheckIdent" and pass a new seed value
--; But we can't pass a seed number as argument, so let's use dynamic sql.
declare @sql nvarchar(200)
set @sql = 'dbcc checkident(TT, reseed, ' + cast(@seedID as varchar) + ')'
exec sp_sqlexec @sql
GO
--; Now the next
insert TT default values
GO
--; as you can see here, next ID is indeed 5
select * from TT
GO
I guess we would really need to know why you want to reuse your identity column. The only reason I can think of is because of the temporary nature of your data you might exhaust the possible values for the identity. That is not really likely, but if that is your concern, you can use uniqueidentifiers (guids) as the primary key in your table instead.
The function newid() will create a new guid and can be used in insert statements (or other statements). Then when you delete the row, you don't have any "holes" in your key because guids are not created in that order anyway.
[Syntax assumes SQL2008....]
Yes, it's possible. You need two management tables, and two triggers on each participating table.
First, the management tables:
-- this table should only ever have one row
CREATE TABLE NextId (Id INT)
INSERT NextId VALUES (1)
GO
CREATE TABLE RecoveredIds (Id INT NOT NULL PRIMARY KEY)
GO
Then, the triggers, two on each table:
CREATE TRIGGER tr_TableName_RecoverId ON TableName
FOR DELETE AS BEGIN
IF @@ROWCOUNT = 0 RETURN
INSERT RecoveredIds (Id) SELECT Id FROM deleted
END
GO
CREATE TRIGGER tr_TableName_AssignId ON TableName
INSTEAD OF INSERT AS BEGIN
DECLARE @rowcount INT = @@ROWCOUNT
IF @rowcount = 0 RETURN
DECLARE @required INT = @rowcount
DECLARE @new_ids TABLE (Id INT PRIMARY KEY)
DELETE TOP (@required) OUTPUT DELETED.Id INTO @new_ids (Id) FROM RecoveredIds
SET @rowcount = @@ROWCOUNT
IF @rowcount < @required BEGIN
DECLARE @output TABLE (Id INT)
UPDATE NextId SET Id = Id + (@required-@rowcount)
OUTPUT DELETED.Id INTO @output
-- this assumes you have a numbers table around somewhere
INSERT @new_ids (Id)
SELECT n.Number+o.Id-1 FROM Numbers n, @output o
WHERE n.Number BETWEEN 1 AND @required-@rowcount
END
SET IDENTITY_INSERT TableName ON
;WITH inserted_CTE AS (SELECT _no = ROW_NUMBER() OVER (ORDER BY Id), * FROM inserted)
, new_ids_CTE AS (SELECT _no = ROW_NUMBER() OVER (ORDER BY Id), * FROM @new_ids)
INSERT TableName (Id, Attr1, Attr2)
SELECT n.Id, i.Attr1, i.Attr2
FROM inserted_CTE i JOIN new_ids_CTE n ON i._no = n._no
SET IDENTITY_INSERT TableName OFF
END
You could script the triggers out easily enough from system tables.
You would want to test this for concurrency. It should work as is, syntax errors notwithstanding: The OUTPUT clause guarantees atomicity of id lookup->increment as one step, and the entire operation occurs within a transaction, thanks to the trigger.
TableName.Id is still an identity column. All the common idioms like $IDENTITY and SCOPE_IDENTITY() will still work.
There is no central table of ids by table, but you could create one easily enough.
I don't have any help for finding the values not in use but if you really want to find them and set them yourself, you can use
set identity_insert on ....
in your code to do so.
I'm with everyone else though. Why bother? Don't you have a business problem to solve?