Fastest way to return a primary key value (SQL Server 2005)

I have a two-column table with a primary key (int) and a unique value (nvarchar(255)).
When I insert a value into this table, I can use SCOPE_IDENTITY() to return the primary key for the value I just inserted. However, if the value already exists, I have to perform an additional select to return the primary key for a follow-up operation (inserting that primary key into a second table).
I'm thinking there must be a better way to do this. I considered using covered indexes, but most of what I've read on them suggests they only help where the table is significantly larger than the index, and this table only has two columns.
Is there any faster way to do this? Would a covered index be faster even if it's the same size as the table?

Building an index won't gain you anything, since you have already declared your value column as unique (which builds an index in the background). In your scenario a full table scan is effectively no different from an index scan.
I assume you want insert-if-not-already-exists behaviour. There is no way around a second select:
if not exists (select ID from ... where name = @...)
begin
    insert into ...
    select SCOPE_IDENTITY()
end
else
    select ID from ... where name = @...
If the value happens to exist, the plan for the second ID select will usually have been cached, so there should be no noticeable performance hit.

[Update statement here]
IF (@@ROWCOUNT = 0)
BEGIN
[Insert statement here]
SELECT SCOPE_IDENTITY()
END
ELSE
BEGIN
[SELECT id statement here]
END
I don't know the exact performance, but this pattern has no big overhead.

As has already been mentioned, this really shouldn't be a slow operation, especially if you index both columns. However, if you are determined to reduce the expense of this operation, I see no reason why you couldn't remove the table entirely and just use the unique value directly rather than looking it up in this table. A 1:1 mapping like this is (theoretically) redundant. I say theoretically because there may be performance implications to using an nvarchar key instead of an int.

I'll post this answer since everyone else seems to say you have to query the table twice in the event that the record exists... that's not true.
Step 1) Create a unique index on the other column:
I recommend this as the index:
-- We're including the "ID" column so that SQL will not have to look far once the "WHERE" clause is satisfied.
CREATE UNIQUE INDEX MyLilIndex ON dbo.MyTable (Column2) INCLUDE (ID)
Step 2)
DECLARE @TheID INT
SELECT @TheID = ID FROM MyTable WHERE Column2 = 'blah blah'
IF (@TheID IS NOT NULL)
BEGIN
    -- See, you don't have to query the table twice!
    SELECT @TheID AS TheIDYouWanted
END
ELSE
BEGIN
    INSERT...
    SELECT SCOPE_IDENTITY() AS TheIDYouWanted
END

Create a unique index for the second entry, then:
if not exists (select null from ...)
insert into ...
else
select x from ...
You can't get away from the index, and it isn't really much overhead -- SQL Server supports index keys up to 900 bytes, and does not discriminate.
The needs of your model are more important than any perceived performance issues. Symbolising a string (which is what you are doing) is a common way to reduce database size, and that indirectly (and generally) means better performance.
-- edit --
To appease timothy:
declare @x int = (select x from ...)
if (@x is not null)
    select @x
else
    ...

You could use the OUTPUT clause to return the value in the same statement. Here is an example.
DDL:
CREATE TABLE ##t (
id int PRIMARY KEY IDENTITY(1,1),
val varchar(255) NOT NULL
)
GO
-- no need for INCLUDE as PK column is always included in the index
CREATE UNIQUE INDEX AK_t_val ON ##t (val)
DML:
DECLARE @id int, @val varchar(255)
SET @val = 'test' -- or whatever you need here
SELECT @id = id FROM ##t WHERE val = @val
IF (@id IS NULL)
BEGIN
DECLARE @new TABLE (id int)
INSERT INTO ##t (val)
OUTPUT inserted.id INTO @new -- put new ID into table variable immediately
VALUES (@val)
SELECT @id = id FROM @new
END
PRINT @id

Related

Constrain Sum(Column) to 1 by some group ID

I have a table where I'm trying to make sure that the aggregate sum of the inserted values adds up to 1 (it's a mixture).
I want to constrain it so that the whole FKID = 2 group fails because it adds up to 1.1.
Currently my constraint is
FUNCTION [dbo].[CheckSumTarget](@ID bigint)
RETURNS bit
AS BEGIN
DECLARE @Res BIT
SELECT @Res = Count(1)
FROM dbo.Test AS t
WHERE t.FKID = @ID
GROUP BY t.FKID
HAVING Sum(t.[Value]) <> 1
RETURN @Res
END
GO
ALTER TABLE dbo.Test WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK (([dbo].[CheckSumTarget]([FKID])<>(1)))
but it's failing on the first insert because it doesn't add up to 1 yet. I was hoping if I add them all simultaneously, that wouldn't be the case.
This approach seems fraught with problems.
I would suggest another approach, starting with two tables:
aggregates, so "fkid" should really be aggregate_id
components
Then, in aggregates accumulate the sum() of the component values using a trigger. Maintain another flag that is computed:
alter table aggregates add is_valid as (case when sum_value = 1.0 then 1 else 0 end)
Then, create views on the two tables to only show records where is_valid = 1. For instance:
create view v_aggregates as
select c.*
from aggregates a join
components c
on a.aggregate_id = c.aggregate_id
where a.is_valid = 1;
Here is a working version of the solution.
Here is the table DDL:
create table dbo.test(
id int,
fkid bigint,
value decimal(4,2)
);
The function definition:
CREATE FUNCTION [dbo].[CheckSumTarget](@ID bigint)
RETURNS bit
AS BEGIN
DECLARE @Res decimal(4,2)
SELECT @Res = case when sum(value) > 1 then 1 else 0 end
FROM dbo.Test AS t
WHERE t.FKID = @ID
RETURN @Res
END
And the constraint definition:
ALTER TABLE dbo.Test WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK ([dbo].[CheckSumTarget]([FKID]) <> 1)
In your example
insert into dbo.test values (1, 2, 0.5);
insert into dbo.test values (1, 2, 0.4);
-- The following insert will fail, like you expect
insert into dbo.test values (1, 2, 0.2);
Note: This solution will be broken by UPDATE statements (as pointed out by 'Daniel Brughera'); however, that is a known behaviour. A better and more common approach is to use a trigger. You may want to explore that.
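A minimal sketch of such a trigger, assuming the dbo.Test table from the DDL above (the trigger name and error text are illustrative):

```sql
CREATE TRIGGER tr_Test_CheckSum ON dbo.Test
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    -- Re-check every FKID group touched by this statement
    IF EXISTS (
        SELECT t.FKID
        FROM dbo.Test AS t
        WHERE t.FKID IN (SELECT FKID FROM inserted
                         UNION
                         SELECT FKID FROM deleted)
        GROUP BY t.FKID
        HAVING SUM(t.value) > 1
    )
    BEGIN
        RAISERROR('Component values for a FKID may not sum to more than 1.', 16, 1)
        ROLLBACK TRANSACTION
    END
END
```

Unlike the check-constraint function, a trigger sees the whole statement at once, so multi-row inserts and updates are validated per group rather than per row.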
Your actual approach will work this way:
1. You insert the first component; the value must be 1.
2. You try to insert a second component; it is rejected because your sum is already 1.
3. You update the existing component to .85.
4. You insert the next component; the value must be .15.
5. You go back to step 2 with the third component.
Since your constraint only takes care of the FKID column, this is possible, and you may think it is working...
But... if you leave the process at step 3, your sum is not equal to 1, and it is impossible for the constraint to foresee whether you will insert the next value or not. Even worse, you can update any value to be greater than 1 and it will be accepted.
If you add the value column to your constraint, it will prevent those updates, but you will never be able to get beyond step 1.
Personally I wouldn't do that, but here is an approach:
Use the computed column suggested by Gordon on your parent table. With computed columns you always get the actual value, so the parent won't be valid until the sum is equal to one.
Use this solution to prevent the value from being greater than 1; that way you can at least be sure that any non-valid parent is only missing a component, which can be helpful for your business layer.
As I mentioned in a comment, the rest of the logic belongs in the business and UI layers.
Note: as you can see, the id and value parameters are not used inside the function, but passing them when I create the constraint means the constraint validates updates too.
CREATE TABLE ttest (id int, fkid int, value float)
go
create FUNCTION [dbo].[CheckSumTarget](@id int, @fkid int, @value float)
RETURNS FLOAT
AS BEGIN
DECLARE @Res float
SELECT @Res = sum(value)
FROM dbo.ttest AS t
WHERE t.FKID = @fkid
RETURN @Res
END
GO
ALTER TABLE dbo.ttest WITH CHECK ADD CONSTRAINT [CK_Target_Sum] CHECK (([dbo].[CheckSumTarget](id,[FKID],value)<=(1.0)))

Generating the Next Id when Id is non-AutoNumber

I have a table called Employee. The EmpId column serves as the primary key. In my scenario, I cannot make it AutoNumber.
What would be the best way of generating the next EmpId for the new row that I want to insert in the table?
I am using SQL Server 2008 with C#.
Here is the code I am currently using; it generates IDs for key-value-pair tables and link tables (m:n relations):
Create PROCEDURE [dbo].[mSP_GetNEXTID]
@NEXTID int out,
@TABLENAME varchar(100),
@UPDATE CHAR(1) = NULL
AS
BEGIN
DECLARE @QUERY VARCHAR(500)
BEGIN
IF EXISTS (SELECT LASTID FROM LASTIDS WHERE TABLENAME = @TABLENAME and active=1)
BEGIN
SELECT @NEXTID = LASTID FROM LASTIDS WHERE TABLENAME = @TABLENAME and active=1
IF(@UPDATE IS NULL OR @UPDATE = '')
BEGIN
UPDATE LASTIDS
SET LASTID = LASTID + 1
WHERE TABLENAME = @TABLENAME
and active=1
END
END
ELSE
BEGIN
SET @NEXTID = 1
INSERT INTO LASTIDS(LASTID,TABLENAME, ACTIVE)
VALUES(@NEXTID+1,@TABLENAME, 1)
END
END
END
Using MAX(id) + 1 is a bad idea, both performance- and concurrency-wise.
Instead you should use sequences, which were designed specifically for this kind of problem.
CREATE SEQUENCE EmpIdSeq AS bigint
START WITH 1
INCREMENT BY 1;
And to generate the next id use:
SELECT NEXT VALUE FOR EmpIdSeq;
You can use the generated value in an insert statement:
INSERT Emp (EmpId, X, Y)
VALUES (NEXT VALUE FOR EmpIdSeq, 'x', 'y');
And even use it as default for your column:
CREATE TABLE Emp
(
EmpId bigint PRIMARY KEY CLUSTERED
DEFAULT (NEXT VALUE FOR EmpIdSeq),
X nvarchar(255) NULL,
Y nvarchar(255) NULL
);
Update: The above solution is only applicable to SQL Server 2012+. For older versions you can simulate the sequence behavior using dummy tables with identity fields:
CREATE TABLE EmpIdSeq (
SeqID bigint IDENTITY PRIMARY KEY CLUSTERED
);
And procedures that emulates NEXT VALUE:
CREATE PROCEDURE GetNewSeqVal_Emp
@NewSeqVal bigint OUTPUT
AS
BEGIN
SET NOCOUNT ON
INSERT EmpIdSeq DEFAULT VALUES
SET @NewSeqVal = scope_identity()
DELETE FROM EmpIdSeq WITH (READPAST)
END;
Usage example:
DECLARE @NewSeqVal bigint
EXEC GetNewSeqVal_Emp @NewSeqVal OUTPUT
The performance overhead of deleting the last inserted element will be minimal; still, as pointed out by the original author, you can optionally remove the delete statement and schedule a maintenance job to delete the table contents off-hour (trading space for performance).
Adapted from SQL Server Customer Advisory Team Blog.
Working SQL Fiddle
The above
select max(empid) + 1 from employee
is one way to get the next number, but if there are multiple users inserting into the database, context switching might cause two users to get the same value for empid, add 1 to each, and end up with duplicate ids. If you do have multiple users, you may have to lock the table while inserting. This is not best practice, and that is why auto-increment exists for database tables.
I hope this works for you, assuming that your ID field is an integer:
INSERT INTO [Table] WITH (TABLOCK)
SELECT (SELECT CASE WHEN MAX(ID) IS NULL THEN 1 ELSE MAX(ID)+1 END FROM [Table]),
       VALUE_1, VALUE_2....
Try the following query (note that SQL Server does not allow a subquery inside a VALUES clause, so the SELECT form is used):
INSERT INTO [Table]
SELECT isnull(MAX(ID),0)+1, VALUE_1, VALUE_2.... FROM [Table]
You have to wrap MAX(ID) in ISNULL; otherwise the result is NULL when the table contains no rows.

Checking sql unique value with constraint

I have a situation where a table has three columns: ID, Value and Status. For a given ID there should be only one row with status 1, while an ID may have any number of rows with status 0. A plain unique key would prevent an ID from having more than one row of either status (0 or 1).
Is there a way to solve this, maybe using constraints?
Thanks
You can create an indexed view that will uphold your constraint of keeping ID unique for [Status] = 1.
create view dbo.v_YourTable with schemabinding as
select ID
from dbo.YourTable
where [Status] = 1
go
create unique clustered index UX_v_UniTest_ID on v_YourTable(ID)
In SQL Server 2008 you could use a unique filtered index instead.
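A filtered-index version (SQL Server 2008+) might look like this, reusing the YourTable/Status names from the indexed-view example; the index name is illustrative:

```sql
-- Uniqueness of ID is enforced only among rows with Status = 1;
-- any number of Status = 0 rows per ID remain allowed.
CREATE UNIQUE NONCLUSTERED INDEX UX_YourTable_OneActivePerID
ON dbo.YourTable (ID)
WHERE [Status] = 1
```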
If the table can have duplicate ID values, then a check constraint wouldn't work for your situation. I think the only way would be to use a trigger. If you are looking for an example then I can post one. But in summary, use a trigger to test if the inserted/updated ID has a status of 1 that is duplicated across the same ID.
EDIT: You could always use a unique constraint on ID and Value. I'm thinking that will give you what you are looking for.
You could put this into an insert/ update trigger to check to make sure only one combination exists with the 1 value; if your condition is not met, you could throw a trappable error and force the operation to roll back.
If you can use NULL instead of 0 for a zero-status, then you can use a UNIQUE constraint on the pair, and it should work in engines where NULL is not an actual value (NULL != NULL), so rows with multiple NULLs do not conflict. Beware that SQL Server is an exception here: its UNIQUE constraints and indexes treat NULLs as duplicates, so this trick does not work there.
IMHO, this is basically a normalisation problem. The column named "id" does not uniquely identify a row, so it can never be a PK; at least a new (surrogate) key element is needed. The constraint itself cannot be expressed "within the row", so it has to be expressed in terms of a FK.
So it breaks down into two tables:
One, with PK = id, and a FK referencing two.sid
Two, with PK = a surrogate key, and FK id referencing one.id; the original payload "value" also lives here.
The "one bit variable" disappears, because it can be expressed in terms of EXISTS (effectively, table one points to the row that holds the token).
[I expect the Postgres rule system could be used to use the above two-tables-model to emulate the intended behaviour of the OP. But that would be an ugly hack...]
EDIT/UPDATE:
Postgres supports partial/conditional indexes. (I don't know about MS SQL.)
DROP TABLE tmp.one;
CREATE TABLE tmp.one
( sid INTEGER NOT NULL PRIMARY KEY -- surrogate key
, id INTEGER NOT NULL
, status INTEGER NOT NULL DEFAULT '0'
/* ... payload */
);
INSERT INTO tmp.one(sid,id,status) VALUES
(1,1,0) , (2,1,1) , (3,1,0)
, (4,2,0) , (5,2,0) , (6,2,1)
, (7,3,0) , (8,3,0) , (9,3,1)
;
CREATE UNIQUE INDEX only_one_non_zero ON tmp.one (id)
WHERE status > 0 -- "partial index"
;
\echo this should succeed
BEGIN ;
UPDATE tmp.one SET status = 0 WHERE sid=2;
UPDATE tmp.one SET status = 1 WHERE sid=1;
COMMIT;
\echo this should fail
BEGIN ;
UPDATE tmp.one SET status = 1 WHERE sid=4;
UPDATE tmp.one SET status = 0 WHERE sid=9;
COMMIT;
SELECT * FROM tmp.one ORDER BY sid;
I came up with a solution.
First, create a function:
CREATE FUNCTION [dbo].[Check_Status] (@ID int)
RETURNS INT
AS
BEGIN
DECLARE @r INT;
SET @r =
(SELECT SUM(status) FROM dbo.table WHERE ID = @ID);
RETURN @r;
END
Second, create a constraint on the table:
([dbo].[Check_Status]([ID])<(2))
This way one ID can have a single status-1 row and as many status-0 rows as needed.
create function dbo.IsValueUnique
(
@proposedValue varchar(50)
,@currentId int
)
RETURNS bit
AS
/*
--EXAMPLE
print dbo.IsValueUnique() -- fail
print dbo.IsValueUnique(null) -- fail
print dbo.IsValueUnique(null,1) -- pass
print dbo.IsValueUnique('Friendly',1) -- pass
*/
BEGIN
DECLARE @count bit
set @count =
(
select count(1)
from dbo.MyTable
where @proposedValue is not null
and dbo.MyTable.MyPkColumn != @currentId
and dbo.MyTable.MyColumn = @proposedValue
)
RETURN case when @count = 0 then 1 else 0 end
END
GO
ALTER TABLE MyTable
WITH CHECK
add constraint CK_ColumnValueIsNullOrUnique
CHECK ( 1 = dbo.IsValueUnique([MyColumn],[MyPkColumn]) )
GO

Does anyone know a neat trick for reusing identity values?

Typically when you specify an identity column you get a convenient interface in SQL Server for asking for a particular row:
SELECT * FROM MyTable WHERE $IDENTITY = @pID
You don't really need to concern yourself with the name of the identity column, because there can only be one.
But what if I have a table which mostly consists of temporary data -- lots of inserts and lots of deletes? Is there a simple way for me to reuse the identity values?
Preferably I would want to be able to write a function that would return, say, NEXT_SMALLEST($IDENTITY) as the next identity value, and do so in a fail-safe manner.
Basically, find the smallest value that's not in use. That's not entirely trivial to do, but what I want is to be able to tell SQL Server that this is my function that will generate the identity values. But what I know is that no such function exists...
I want to...
Implement global database IDs; I need to provide a default value that I'm in control of.
My idea was based on having a table with all known IDs; every row in any other table that needed a global ID would reference that table. The default value would be provided by something like:
INSERT INTO GlobalID DEFAULT VALUES
RETURN SCOPE_IDENTITY()
No; it's not unique if it can be reused.
Why do you want to re-use them? Why do you concern yourself with this field? If you want to be in control of it, don't make it an identity; create your own scheme and use that.
Don't reuse identities; you'll just shoot yourself in the foot. Use a large enough type that it never rolls over (64-bit bigint).
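For instance (table and column names are illustrative):

```sql
-- bigint identity: ~9.2 * 10^18 values before rollover, so gaps left
-- by deletes and rollbacks simply never matter in practice.
CREATE TABLE dbo.WorkQueue
(
    Id bigint IDENTITY(1,1) PRIMARY KEY,
    Payload nvarchar(255) NULL
)
```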
To find missing gaps in a sequence of numbers join the table against itself with a +/- 1 difference:
SELECT a.id
FROM table AS a
LEFT OUTER JOIN table AS b ON a.id = b.id+1
WHERE b.id IS NULL;
This query will find the numbers in the id sequence for which id-1 is not in the table, i.e. contiguous-sequence start numbers. You can then use SET IDENTITY_INSERT ... ON to insert a specific id and reuse a number. The cost of doing so is overwhelming (both runtime and code complexity) compared with an ordinary identity-based insert.
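Reusing a gap found by the query above would look like this (assuming the gap query returned 5, and a table/column list that is purely illustrative):

```sql
-- Temporarily allow explicit values for the identity column
SET IDENTITY_INSERT dbo.TableName ON;

INSERT INTO dbo.TableName (id, col1)
VALUES (5, 'reused id');

SET IDENTITY_INSERT dbo.TableName OFF;
```

Note that IDENTITY_INSERT can be ON for only one table per session at a time, which is one more reason this does not scale as a general strategy.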
If you really want to reset the identity value to the lowest, here is a trick you can use with DBCC CHECKIDENT.
The following SQL statements reset the identity value so that it restarts from the lowest possible number:
create table TT (id int identity(1, 1))
GO
insert TT default values
GO 10
select * from TT
GO
delete TT where id between 5 and 10
GO
--; At this point, next ID will be 11, not 5
select * from TT
GO
insert TT default values
GO
--; as you can see here, next ID is indeed 11
select * from TT
GO
--; Now delete ID = 11
--; so that we can reseed next highest ID to 5
delete TT where id = 11
GO
--; Now, let's reseed the identity value to the lowest possible identity number
declare @seedID int
select @seedID = max(id) from TT
print @seedID --; 4
--; We reseed the identity column with "DBCC CHECKIDENT" and pass a new seed value.
--; But we can't pass a variable as the seed argument directly, so let's use dynamic SQL.
declare @sql nvarchar(200)
set @sql = 'dbcc checkident(TT, reseed, ' + cast(@seedID as varchar) + ')'
exec sp_sqlexec @sql
GO
--; Now the next insert
insert TT default values
GO
--; as you can see here, next ID is indeed 5
select * from TT
GO
I guess we would really need to know why you want to reuse your identity column. The only reason I can think of is that, given the temporary nature of your data, you might exhaust the possible values for the identity. That is not really likely, but if that is your concern, you can use uniqueidentifiers (guids) as the primary key in your table instead.
The function NEWID() will create a new guid and can be used in insert statements (or other statements). Then when you delete a row you don't have any "holes" in your key, because guids are not created in order anyway.
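A sketch of what that could look like (table and column names are illustrative; NEWSEQUENTIALID() is an alternative default that reduces index fragmentation):

```sql
CREATE TABLE dbo.Employee
(
    EmpId uniqueidentifier NOT NULL
        CONSTRAINT DF_Employee_EmpId DEFAULT NEWID()
        CONSTRAINT PK_Employee PRIMARY KEY,
    Name nvarchar(255) NOT NULL
)

-- EmpId is filled in automatically by the default
INSERT INTO dbo.Employee (Name) VALUES (N'Alice')
```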
[Syntax assumes SQL2008....]
Yes, it's possible. You need two management tables, and two triggers on each participating table.
First, the management tables:
-- this table should only ever have one row
CREATE TABLE NextId (Id INT)
INSERT NextId VALUES (1)
GO
CREATE TABLE RecoveredIds (Id INT NOT NULL PRIMARY KEY)
GO
Then, the triggers, two on each table:
CREATE TRIGGER tr_TableName_RecoverId ON TableName
FOR DELETE AS BEGIN
IF @@ROWCOUNT = 0 RETURN
INSERT RecoveredIds (Id) SELECT Id FROM deleted
END
GO
CREATE TRIGGER tr_TableName_AssignId ON TableName
INSTEAD OF INSERT AS BEGIN
DECLARE @rowcount INT = @@ROWCOUNT
IF @rowcount = 0 RETURN
DECLARE @required INT = @rowcount
DECLARE @new_ids TABLE (Id INT PRIMARY KEY)
DELETE TOP (@required) OUTPUT DELETED.Id INTO @new_ids (Id) FROM RecoveredIds
SET @rowcount = @@ROWCOUNT
IF @rowcount < @required BEGIN
DECLARE @output TABLE (Id INT)
UPDATE NextId SET Id = Id + (@required-@rowcount)
OUTPUT DELETED.Id INTO @output
-- this assumes you have a numbers table around somewhere
INSERT @new_ids (Id)
SELECT n.Number+o.Id-1 FROM Numbers n, @output o
WHERE n.Number BETWEEN 1 AND @required-@rowcount
END
SET IDENTITY_INSERT TableName ON
;WITH inserted_CTE AS (SELECT _no = ROW_NUMBER() OVER (ORDER BY Id), * FROM inserted)
, new_ids_CTE AS (SELECT _no = ROW_NUMBER() OVER (ORDER BY Id), * FROM @new_ids)
INSERT TableName (Id, Attr1, Attr2)
SELECT n.Id, i.Attr1, i.Attr2
FROM inserted_CTE i JOIN new_ids_CTE n ON i._no = n._no
SET IDENTITY_INSERT TableName OFF
END
You could script the triggers out easily enough from system tables.
You would want to test this for concurrency. It should work as is, syntax errors notwithstanding: The OUTPUT clause guarantees atomicity of id lookup->increment as one step, and the entire operation occurs within a transaction, thanks to the trigger.
TableName.Id is still an identity column. All the common idioms like $IDENTITY and SCOPE_IDENTITY() will still work.
There is no central table of ids by table, but you could create one easily enough.
I don't have any help for finding the values not in use, but if you really want to find them and set them yourself, you can use
SET IDENTITY_INSERT [TableName] ON
in your code to do so.
I'm with everyone else though. Why bother? Don't you have a business problem to solve?

Sql Server INSERT scope problem

This may have been asked before, but it's really hard to search for terms that limit the search results...
Take the following SQL snippet:
declare @source table (id int)
declare @target table (id int primary key, sourceId int)
set nocount on
insert into @target values (0,0)
insert into @source(id) values(1)
--insert into @source(id) values(2)
set nocount off
insert into @target select (select max(id)+1 from @target), s.id from @source s
select * from @target
This obviously executes without error, but now uncomment the second insert line and the following error occurs:
Msg 2627, Level 14, State 1, Line 15
Violation of PRIMARY KEY constraint 'PK__#7DB3CB72__7EA7EFAB'. Cannot insert duplicate key in object 'dbo.#target'.
I realise that the insert statement is more than likely executed against a snapshot of the @target table, so (select max(id)+1 from @target) will always return a value of 1, causing the violation error above...
Is there any way around this apart from resorting to a cursor?
Change your insert statement to the following:
insert into @target
select (select max(id) from @target) + row_number() over (order by s.id), s.id
from @source s
This should work for this specific case, but I would be careful about generalizing it.
You could use an identity column (that's exactly what they are meant for):
declare @target table (id int IDENTITY(1,1), sourceId int)
If your problem is that the select clause is "computed" before the insert is executed, then AFAIK there's no way around this with a single SQL request.
I think it's by design; for your insertion to avoid duplicates, the id must be computed during the insert, not during the select. This is the exact purpose of the IDENTITY keyword.
If you want to insert one row at a time, you must write separate requests (using cursors, for example, but you'll lose atomicity and will have to use proper locking hints to avoid race conditions).
The way you're determining your new PK value is a race condition waiting to happen.
If your DB is under high load and multiple records are being inserted at the same time, you're going to get unexpected results.
Why don't you just use an identity column and let the database handle the assignment of a new primary id?
Or you can create some kind of meta-table, which holds a record for every table in your database containing the next value that should be used as a primary id in that table.
Then you must make sure that every time you create a new record you also update the next-value in your meta-table (with appropriate locking), but I see no added value in this approach versus making use of identity columns.
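For completeness, a hedged sketch of that meta-table approach (all names illustrative); the UPDLOCK/HOLDLOCK hints supply the "appropriate locking", claiming an id and incrementing the counter in one atomic statement:

```sql
CREATE TABLE dbo.NextIds
(
    TableName sysname NOT NULL PRIMARY KEY,
    NextId    int     NOT NULL
)

-- Claim the next id for one table atomically;
-- the UPDATE both reads the current value into @id and increments it.
DECLARE @id int
UPDATE dbo.NextIds WITH (UPDLOCK, HOLDLOCK)
SET @id = NextId,
    NextId = NextId + 1
WHERE TableName = 'SomeTable'
```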