Arithmetic overflow on large table in SQL Server

I have a table with 5 billion rows in SQL Server 2014 (Developer Edition, x64, Windows 10 Pro x64):
CREATE TABLE TestTable
(
ID BIGINT IDENTITY(1,1),
PARENT_ID BIGINT NOT NULL,
CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED (ID)
);
CREATE NONCLUSTERED INDEX IX_TestTable_ParentId
ON TestTable (PARENT_ID);
I'm trying to apply the following patch:
-- Create non-nullable column with default (should be online operation in Enterprise/Developer edition)
ALTER TABLE TestTable
ADD ORDINAL TINYINT NOT NULL CONSTRAINT DF_TestTable_Ordinal DEFAULT 0;
GO
-- Populate column value for existing data
BEGIN
    SET NOCOUNT ON;

    DECLARE @BATCH_SIZE BIGINT = 1000000;
    DECLARE @COUNTER BIGINT = 0;
    DECLARE @ROW_ID BIGINT;
    DECLARE @ORDINAL BIGINT;

    DECLARE ROWS_C CURSOR
    LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY
    FOR
    SELECT
        ID AS ID,
        ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
    FROM
        TestTable;

    OPEN ROWS_C;

    FETCH NEXT FROM ROWS_C
    INTO @ROW_ID, @ORDINAL;

    BEGIN TRANSACTION;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE TestTable
        SET
            ORDINAL = CAST(@ORDINAL AS TINYINT)
        WHERE
            ID = @ROW_ID;

        FETCH NEXT FROM ROWS_C
        INTO @ROW_ID, @ORDINAL;

        SET @COUNTER = @COUNTER + 1;

        IF @COUNTER = @BATCH_SIZE
        BEGIN
            COMMIT TRANSACTION;
            SET @COUNTER = 0;
            BEGIN TRANSACTION;
        END;
    END;

    COMMIT TRANSACTION;

    CLOSE ROWS_C;
    DEALLOCATE ROWS_C;

    SET NOCOUNT OFF;
END;
GO
-- Drop default constraint from the column
ALTER TABLE TestTable
DROP CONSTRAINT DF_TestTable_Ordinal;
GO
-- Drop IX_TestTable_ParentId index
DROP INDEX IX_TestTable_ParentId
ON TestTable;
GO
-- Create IX_TestTable_ParentId_Ordinal index
CREATE UNIQUE INDEX IX_TestTable_ParentId_Ordinal
ON TestTable (PARENT_ID, ORDINAL);
GO
The aim of the patch is to add a column called ORDINAL, which holds the ordinal number of the record within its parent (defined by PARENT_ID). The patch is run using SQLCMD.
The patch is written this way for several reasons:
- The table is too large to update in a single UPDATE statement (it takes an enormous amount of time and space in the transaction log/tempdb).
- Batched updates using a single UPDATE statement with TOP (n) rows are not simple to implement correctly: if we update the table in, say, 1M-row batches, the 1,000,001st row may belong to the same PARENT_ID as the 1,000,000th, which would assign a wrong ordinal number to the 1,000,001st record. In other words, the SELECT statement driving the cursor should be run once (without paging), or more complicated operations (joins/conditions) would have to be applied.
- Adding a NULL column and changing it to NOT NULL later is not a good solution, since I use SNAPSHOT isolation (a full table update is performed when altering the column to NOT NULL).
The patch works perfectly on a small database with a few million rows, but when applied to the one with billions of rows, I get:
Msg 3606, Level 16, State 2, Server XXX, Line 22
Arithmetic overflow occurred.
My first guess was that the ORDINAL value is too big to fit into the TINYINT column, but this is not the case: I created a test database with a similar structure, populated it with more than 255 rows per parent, and the error I got there was still an arithmetic overflow, but with a different message code and different wording (explicitly saying the data cannot fit into TINYINT).
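For reference, the per-parent bound can be checked directly, and overflowing TINYINT in a plain conversion raises the explicit message I describe above; a quick sketch against the schema from this question (the numbers are illustrative):
-- Any parent with more than 255 children cannot fit its ordinals into TINYINT
SELECT TOP (10) PARENT_ID, COUNT_BIG(*) AS ChildCount
FROM TestTable
GROUP BY PARENT_ID
HAVING COUNT_BIG(*) > 255
ORDER BY ChildCount DESC;

-- The explicit TINYINT overflow reads differently from Msg 3606, e.g.:
-- "Arithmetic overflow error for data type tinyint, value = 256."
SELECT CAST(256 AS TINYINT);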
Currently I have a couple of suspicions, but I haven't managed to find anything that could help me:
- The cursor is not able to handle more than MAX(INT32) rows.
- SQLCMD-imposed limitations.
Do you have any ideas on what could the problem be?

How about using a WHILE loop, but making sure that you keep rows with the same PARENT_ID together:
DECLARE @SegmentSize BIGINT = 1000000
DECLARE @CurrentSegment BIGINT = 0

WHILE 1 = 1
BEGIN
    ;WITH UpdateData AS
    (
        SELECT ID AS ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE ID > @CurrentSegment AND ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
    SET Ordinal = UpdateData.Ordinal
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.ID = UpdateData.ID

    IF @@ROWCOUNT = 0
    BEGIN
        BREAK
    END

    SET @CurrentSegment = @CurrentSegment + @SegmentSize
END
EDIT - Amended to segment on PARENT_ID as requested. This should be reasonably quick, as PARENT_ID is indexed (added OPTION (RECOMPILE) to ensure that the actual variable values are used for the lookup). Because you are not updating the whole table in a single statement, this will limit the transaction log growth!
DECLARE @SegmentSize BIGINT = 1000000
DECLARE @CurrentSegment BIGINT = 0

WHILE 1 = 1
BEGIN
    ;WITH UpdateData AS
    (
        SELECT ID AS ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE PARENT_ID > @CurrentSegment AND
              PARENT_ID <= (@CurrentSegment + @SegmentSize)
    )
    UPDATE TestTable
    SET Ordinal = UpdateData.Ordinal
    FROM TestTable
    INNER JOIN UpdateData ON TestTable.ID = UpdateData.ID
    OPTION (RECOMPILE)

    IF @@ROWCOUNT = 0
    BEGIN
        BREAK
    END

    SET @CurrentSegment = @CurrentSegment + @SegmentSize
END
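A further variant (a sketch, not tested at scale): the fixed-width PARENT_ID ranges above assume the parent ids are reasonably dense; with sparse ids a range can come back empty and end the loop early, and batch sizes become uneven. Walking parents keyset-style avoids both, at the cost of one extra index scan per batch. The CAST mirrors the TINYINT column from the question:
DECLARE @LastParent BIGINT = NULL
DECLARE @MaxParent BIGINT

WHILE 1 = 1
BEGIN
    -- Take roughly the next million rows' worth of parents,
    -- but never split a parent across batches
    SELECT @MaxParent = MAX(PARENT_ID)
    FROM (
        SELECT TOP (1000000) PARENT_ID
        FROM TestTable
        WHERE @LastParent IS NULL OR PARENT_ID > @LastParent
        ORDER BY PARENT_ID
    ) AS batch

    IF @MaxParent IS NULL BREAK -- no parents left

    ;WITH UpdateData AS
    (
        SELECT ID,
               ROW_NUMBER() OVER (PARTITION BY PARENT_ID ORDER BY ID ASC) AS ORDINAL
        FROM TestTable
        WHERE (@LastParent IS NULL OR PARENT_ID > @LastParent)
          AND PARENT_ID <= @MaxParent
    )
    UPDATE t
    SET ORDINAL = CAST(u.ORDINAL AS TINYINT)
    FROM TestTable AS t
    INNER JOIN UpdateData AS u ON t.ID = u.ID
    OPTION (RECOMPILE)

    SET @LastParent = @MaxParent
END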

Related

In SQL Server 2008 R2, is there a way to create a custom auto increment identity field without using IDENTITY(1,1)?

I would like to be able to pull the custom key value from a table, but would also like it to perform like SQL Server's IDENTITY(1,1) column on inserts.
The custom key is for another application and will need to be used by different functions so the value will need to be pulled from a table and available for other areas.
Here are some of my attempts:
I tried a trigger on the table; it works well on single inserts but failed on set-based inserts (forgetting the fact that triggers fire once per statement, not per row):
ALTER TRIGGER [sales].[trg_NextInvoiceDocNo]
ON [sales].[Invoice]
AFTER INSERT
AS
BEGIN
    DECLARE @ResultVar VARCHAR(25)
    DECLARE @Key VARCHAR(25)

    EXEC [dbo].[usp_GetNextKeyCounterChar]
        @tcForTbl = 'docNbr', @tcForGrp = 'docNbr', @NewKey = @ResultVar OUTPUT

    UPDATE sales.InvoiceRET
    SET DocNbr = @ResultVar
    FROM sales.InvoiceRET
    JOIN inserted ON inserted.id = sales.InvoiceRET.id;
END;
I thought about a scalar function, but functions cannot execute stored procedures or UPDATE statements in order to set the new key value in the lookup table.
Thanks
You can use ROW_NUMBER() depending on the type of concurrency you are dealing with. Here is some sample data and a demo you can run locally.
-- Sample table
USE tempdb
GO
IF OBJECT_ID('dbo.sometable','U') IS NOT NULL DROP TABLE dbo.sometable;
GO
CREATE TABLE dbo.sometable
(
    SomeId INT NULL,
    Col1 INT NOT NULL
);
GO
-- Stored Proc to insert data
CREATE PROC dbo.InsertProc @output BIT AS
BEGIN -- Your proc starts here
    INSERT dbo.sometable (Col1)
    SELECT datasource.[value]
    FROM (VALUES (CHECKSUM(NEWID()) % 100)) AS datasource([value]) -- simulating data from somewhere
    CROSS APPLY (VALUES (1),(1),(1)) AS x(x);

    WITH
    id(MaxId) AS (SELECT ISNULL(MAX(t.SomeId), 0) FROM dbo.sometable AS t),
    xx AS
    (
        SELECT s.SomeId, RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) + id.MaxId, s.Col1, id.MaxId
        FROM id AS id
        CROSS JOIN dbo.sometable AS s
        WHERE s.SomeId IS NULL
    )
    UPDATE xx SET xx.SomeId = xx.RN;

    IF @output = 1
        SELECT t.* FROM dbo.sometable AS t;
END
GO
Each time I run EXEC dbo.InsertProc 1; it adds 3 more rows with the correct SomeId values, auto-incrementing as needed.
SomeId Col1
-------- ------
1 62
2 73
3 -17
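If concurrent executions of the proc are possible, the MAX-based numbering above can race (two sessions reading the same MAX before either updates). One common guard, which is my addition rather than part of the demo above, is to serialize the callers with an application lock:
BEGIN TRAN;

-- Only one session at a time may pass this point; the lock is tied to the
-- transaction and released at COMMIT/ROLLBACK, so callers queue instead of racing
EXEC sp_getapplock @Resource = 'sometable_SomeId_assign',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction';

EXEC dbo.InsertProc @output = 0;

COMMIT;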

Generate a unique column sequence value based on a query handling concurrency

I have a requirement to automatically generate a column's value based on another query's result. Because this column value must be unique, I need to take into consideration concurrent requests. This query needs to generate a unique value for a support ticket generator.
The template for the unique value is CustomerName-Month-Year-SupportTicketForThisMonthCount.
So the script should automatically generate:
AcmeCo-10-2019-1
AcmeCo-10-2019-2
AcmeCo-10-2019-3
and so on as support tickets are created. How can I ensure that AcmeCo-10-2019-1 is not generated twice if two support tickets are created at the same time for AcmeCo?
insert into SupportTickets (name)
select concat_ws('-', @CustomerName, @Month, @Year, count(*) + 1)
from SupportTickets
where customerName = @CustomerName
and CreatedDate between @MonthStart and @MonthEnd;
One possibility:
Create a counter table:
create table Counter (
    Id int identity(1,1),
    Name varchar(64),
    Count1 int
)
Name is a unique identifier for the sequence, and in your case name would be CustomerName-Month-Year i.e. you would end up with a row in this table for every Customer/Year/Month combination.
Then write a stored procedure similar to the following to allocate a new sequence number:
create procedure [dbo].[Counter_Next]
(
    @Name varchar(64)
    , @Value int out -- Value to be used
)
as
begin
    set nocount, xact_abort on;

    declare @Temp int;

    begin tran;

    -- Ensure we have an exclusive lock before changing variables
    select top 1 1 from dbo.Counter with (tablockx);

    set @Value = null; -- if a value is passed in it stuffs us up, so null it

    -- Attempt an update and assignment in a single statement
    update dbo.[Counter] set
        @Value = Count1 = Count1 + 1
    where [Name] = @Name;

    if @@rowcount = 0 begin
        set @Value = 10001; -- Some starting value
        -- Create a new record if none exists
        insert into dbo.[Counter] ([Name], Count1)
        select @Name, @Value;
    end;

    commit tran;

    return 0;
end;
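A usage sketch for the ticket template from the question (table and column names assumed from the question; note the procedure as written starts new counters at 10001 rather than 1):
declare @TicketSeq int;
declare @Prefix varchar(64) = concat('AcmeCo', '-', 10, '-', 2019);

exec dbo.Counter_Next @Name = @Prefix, @Value = @TicketSeq out;

insert into SupportTickets (name)
values (concat(@Prefix, '-', @TicketSeq));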
You could look into using a time-based value instead of COUNT() to create the unique suffix; that way duplicates are much less likely. Hope that helps.

SQL Server : Merge across multiple tables with foreign key

Here's what I am trying to do: basically send XML to SQL Server to update/insert (Merge) my data as a "save" function in my code.
I have managed to successfully do this if I send one "item" in the XML using the following XML:
<root>
<Formula1>
<M_iFormula1Id>0</M_iFormula1Id>
<M_bDataInUse>0</M_bDataInUse>
<M_bActive>1</M_bActive>
<M_lstItem>
<M_iItemId>0</M_iItemId>
<M_iItemTypeId>1</M_iItemTypeId>
<M_sItemValue>German</M_sItemValue>
<M_iRaceId>1</M_iRaceId>
<M_iDriverId>50</M_iDriverId>
</M_lstItem>
</Formula1>
</root>
in this stored procedure:
ALTER PROCEDURE [dbo].[spFormula1_Save]
    @Formula1Xml xml -- Formula1 as xml
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    IF DATALENGTH(@Formula1Xml) = 0
        RETURN 0

    BEGIN TRANSACTION
    BEGIN TRY
        DECLARE @hDoc INT
        EXEC sp_xml_preparedocument @hDoc OUTPUT, @Formula1Xml

        -------------------
        -- Formula1 Table
        -------------------
        DECLARE @Formula1Id bigint = 0;

        MERGE INTO Formula1 AS tab
        USING
            OPENXML (@hDoc, '/root/Formula1', 2)
            WITH (
                M_iFormula1Id bigint,
                M_bDataInUse bit,
                M_bActive bit
            ) AS [xml]
        ON (tab.Formula1Id = [xml].[M_iFormula1Id])
        WHEN MATCHED THEN UPDATE SET tab.DataInUse = [xml].M_bDataInUse,
                                     tab.Active = [xml].M_bActive,
                                     @Formula1Id = [xml].M_iFormula1Id
        WHEN NOT MATCHED THEN INSERT (DataInUse,
                                      Active)
                              VALUES ([xml].M_bDataInUse,
                                      [xml].M_bActive);

        IF (@Formula1Id = 0) -- then we haven't updated, so get the inserted row id
        BEGIN
            SET @Formula1Id = SCOPE_IDENTITY(); -- get the inserted identity
        END

        -------------------
        -- Formula1Item Table
        -------------------
        MERGE INTO Formula1Item AS tab
        USING
            OPENXML (@hDoc, '/root/Formula1/M_lstItem', 2)
            WITH (
                M_iItemId bigint,
                M_iItemTypeId bit,
                M_sItemValue varchar(1000),
                M_iRaceId int,
                M_iDriverId int
            ) AS [xml]
        ON (tab.ItemId = [xml].M_iItemId)
        WHEN MATCHED THEN UPDATE SET tab.ItemTypeId = [xml].M_iItemTypeId,
                                     tab.ItemValue = [xml].M_sItemValue,
                                     tab.RaceId = [xml].M_iRaceId,
                                     tab.DriverId = [xml].M_iDriverId
        WHEN NOT MATCHED THEN INSERT (Formula1Id,
                                      ItemTypeId,
                                      ItemValue,
                                      RaceId,
                                      DriverId)
                              VALUES (@Formula1Id,
                                      [xml].M_iItemTypeId,
                                      [xml].M_sItemValue,
                                      [xml].M_iRaceId,
                                      [xml].M_iDriverId);

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
When I have multiple records in the XML, @Formula1Id gets set to the last id inserted by the first MERGE statement, so all the child data in the XML gets merged using that id, meaning all child data ends up belonging to one parent!
<root>
<Formula1>
<M_iFormula1Id>0</M_iFormula1Id>
<M_bDataInUse>0</M_bDataInUse>
<M_bActive>1</M_bActive>
<M_lstItem>
<M_iItemId>0</M_iItemId>
<M_iItemTypeId>1</M_iItemTypeId>
<M_sItemValue>German</M_sItemValue>
<M_iRaceId>1</M_iRaceId>
<M_iDriverId>50</M_iDriverId>
</M_lstItem>
</Formula1>
<Formula1>
<M_iFormula1Id>0</M_iFormula1Id>
<M_bDataInUse>0</M_bDataInUse>
<M_bActive>1</M_bActive>
<M_lstItem>
<M_iItemId>0</M_iItemId>
<M_iItemTypeId>1</M_iItemTypeId>
<M_sItemValue>French</M_sItemValue>
<M_iRaceId>2</M_iRaceId>
<M_iDriverId>50</M_iDriverId>
</M_lstItem>
</Formula1>
</root>
Is there any way to perform this while keeping the foreign key relationship correct?
Perhaps the Merge statement is the wrong way to go but it seems like the best way to handle a lot of inserts/updates at once.
Maybe you could suggest an alternative method - the main criterion is performance, as there could be thousands of items to "save". I have tried to look at SqlBulkCopy, but this doesn't seem to handle foreign key relationships very well either... I know I could save to one table at a time, but then I lose the ROLLBACK functionality should one part of the "save" go wrong!
Any help/suggestions are greatly appreciated. Thanks in advance.
Try using the following solution (it's not tested; I assumed that you can have many "Formula1" elements; you should read my notes carefully):
ALTER PROCEDURE [dbo].[spFormula1_Save]
    @Formula1Xml xml -- Formula1 as xml
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT, XACT_ABORT ON;

    IF DATALENGTH(@Formula1Xml) = 0
        RETURN 0

    ------------------------
    -- Xml shredding
    ------------------------
    -- I prefer using the new XML methods (nodes, value, exist) instead of
    -- sp_xml_preparedocument + OPENXML, because you may get memory leaks
    -- if you don't call sp_xml_removedocument.
    DECLARE @Formula1_Table TABLE
    (
        M_iFormula1Id bigint,
        Rnk bigint primary key, -- Used to uniquely identify the old and the new rows
        M_bDataInUse bit,
        M_bActive bit
    );

    INSERT @Formula1_Table (M_iFormula1Id, Rnk, M_bDataInUse, M_bActive)
    SELECT x.XmlCol.value('(M_iFormula1Id)[1]', 'BIGINT') AS M_iFormula1Id,
           ROW_NUMBER() OVER(ORDER BY x.XmlCol) AS Rnk, -- Used to uniquely identify the old and the new rows
           x.XmlCol.value('(M_bDataInUse)[1]', 'BIT') AS M_bDataInUse,
           x.XmlCol.value('(M_bActive)[1]', 'BIT') AS M_bActive
    FROM @Formula1Xml.nodes('/root/Formula1') x(XmlCol);

    DECLARE @Formula1_M_lstItem_Table TABLE
    (
        M_iFormula1Id bigint,
        Rnk bigint, -- Used to uniquely identify new "Formula1" rows (those rows having M_iFormula1Id = 0)
        M_iItemId bigint,
        M_iItemTypeId bit,
        M_sItemValue varchar(1000),
        M_iRaceId int,
        M_iDriverId int
    );

    INSERT @Formula1_M_lstItem_Table
    (
        M_iFormula1Id,
        Rnk,
        M_iItemId,
        M_iItemTypeId,
        M_sItemValue,
        M_iRaceId,
        M_iDriverId
    )
    SELECT /*x.XmlCol.value('(M_iFormula1Id)[1]', 'BIGINT')*/
           -- At this moment we insert only NULLs
           NULL AS M_iFormula1Id,
           DENSE_RANK() OVER(ORDER BY x.XmlCol) AS Rnk, -- Used to uniquely identify new and old "Formula1" rows
           y.XmlCol.value('(M_iItemId)[1]', 'BIGINT') AS M_iItemId,
           y.XmlCol.value('(M_iItemTypeId)[1]', 'BIT') AS M_iItemTypeId,
           y.XmlCol.value('(M_sItemValue)[1]', 'VARCHAR(1000)') AS M_sItemValue,
           y.XmlCol.value('(M_iRaceId)[1]', 'INT') AS M_iRaceId,
           y.XmlCol.value('(M_iDriverId)[1]', 'INT') AS M_iDriverId
    FROM @Formula1Xml.nodes('/root/Formula1') x(XmlCol)
    CROSS APPLY x.XmlCol.nodes('M_lstItem') y(XmlCol);
    ------------------------
    -- End of Xml shredding
    ------------------------

    BEGIN TRANSACTION
    BEGIN TRY
        -------------------
        -- Formula1 Table
        -------------------
        DECLARE @Merged_Rows TABLE
        (
            Merge_Action nvarchar(10) not null,
            Rnk bigint not null,
            M_iFormula1Id bigint -- The old ids and the newly inserted ids
        );
        DECLARE @Formula1Id bigint = 0;

        MERGE INTO Formula1 WITH (HOLDLOCK) AS tab -- To prevent a race condition. http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx
        USING @Formula1_Table AS [xml]
        ON (tab.Formula1Id = [xml].[M_iFormula1Id])
        WHEN MATCHED THEN UPDATE SET tab.DataInUse = [xml].M_bDataInUse,
                                     tab.Active = [xml].M_bActive
                                     -- We no longer need this line because of the OUTPUT clause
                                     -- @Formula1Id = [xml].M_iFormula1Id
        WHEN NOT MATCHED THEN INSERT (DataInUse,
                                      Active)
                              VALUES ([xml].M_bDataInUse,
                                      [xml].M_bActive)
        -- This OUTPUT clause inserts into @Merged_Rows the Rnk and the (old or new) id for every /root/Formula1 element
        -- http://msdn.microsoft.com/en-us/library/ms177564.aspx
        OUTPUT $action, [xml].Rnk, inserted.Formula1Id INTO @Merged_Rows (Merge_Action, Rnk, M_iFormula1Id);

        -- This is replaced by the previous OUTPUT clause
        /*
        IF (@Formula1Id = 0) -- then we haven't updated, so get the inserted row id
        BEGIN
            SET @Formula1Id = SCOPE_IDENTITY(); -- get the inserted identity
        END
        */

        -- At this moment we replace all previously inserted NULLs with the real (old and new) ids
        UPDATE x
        SET M_iFormula1Id = y.M_iFormula1Id
        FROM @Formula1_M_lstItem_Table x
        JOIN @Merged_Rows y ON x.Rnk = y.Rnk;

        -------------------
        -- Formula1Item Table
        -------------------
        MERGE INTO Formula1Item AS tab
        USING @Formula1_M_lstItem_Table AS [xml]
        ON (tab.ItemId = [xml].M_iItemId)
        -- Maybe you also need this join predicate: (tab.Formula1Id = [xml].M_iFormula1Id)
        WHEN MATCHED THEN UPDATE SET tab.ItemTypeId = [xml].M_iItemTypeId,
                                     tab.ItemValue = [xml].M_sItemValue,
                                     tab.RaceId = [xml].M_iRaceId,
                                     tab.DriverId = [xml].M_iDriverId
        WHEN NOT MATCHED THEN INSERT (Formula1Id,
                                      ItemTypeId,
                                      ItemValue,
                                      RaceId,
                                      DriverId)
                              VALUES ([xml].M_iFormula1Id,
                                      [xml].M_iItemTypeId,
                                      [xml].M_sItemValue,
                                      [xml].M_iRaceId,
                                      [xml].M_iDriverId);

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
        -- The caller should be informed when an error / exception is caught
        -- THROW
    END CATCH;
END
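A call sketch, reusing the single-record XML from the question:
DECLARE @xml xml = N'<root>
  <Formula1>
    <M_iFormula1Id>0</M_iFormula1Id>
    <M_bDataInUse>0</M_bDataInUse>
    <M_bActive>1</M_bActive>
    <M_lstItem>
      <M_iItemId>0</M_iItemId>
      <M_iItemTypeId>1</M_iItemTypeId>
      <M_sItemValue>German</M_sItemValue>
      <M_iRaceId>1</M_iRaceId>
      <M_iDriverId>50</M_iDriverId>
    </M_lstItem>
  </Formula1>
</root>';

EXEC [dbo].[spFormula1_Save] @Formula1Xml = @xml;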

Generating the Next Id when Id is non-AutoNumber

I have a table called Employee. The EmpId column serves as the primary key. In my scenario, I cannot make it AutoNumber.
What would be the best way of generating the next EmpId for the new row that I want to insert in the table?
I am using SQL Server 2008 with C#.
Here is the code that I currently have, but it is for entering ids into key-value-pair tables or link tables (m:n relations):
CREATE PROCEDURE [dbo].[mSP_GetNEXTID]
    @NEXTID int out,
    @TABLENAME varchar(100),
    @UPDATE CHAR(1) = NULL
AS
BEGIN
    DECLARE @QUERY VARCHAR(500)
    BEGIN
        IF EXISTS (SELECT LASTID FROM LASTIDS WHERE TABLENAME = @TABLENAME AND active = 1)
        BEGIN
            SELECT @NEXTID = LASTID FROM LASTIDS WHERE TABLENAME = @TABLENAME AND active = 1
            IF (@UPDATE IS NULL OR @UPDATE = '')
            BEGIN
                UPDATE LASTIDS
                SET LASTID = LASTID + 1
                WHERE TABLENAME = @TABLENAME
                AND active = 1
            END
        END
        ELSE
        BEGIN
            SET @NEXTID = 1
            INSERT INTO LASTIDS (LASTID, TABLENAME, ACTIVE)
            VALUES (@NEXTID + 1, @TABLENAME, 1)
        END
    END
END
Using MAX(id) + 1 is a bad idea, both performance- and concurrency-wise.
Instead you should resort to sequences, which were designed specifically for this kind of problem.
CREATE SEQUENCE EmpIdSeq AS bigint
START WITH 1
INCREMENT BY 1;
And to generate the next id use:
SELECT NEXT VALUE FOR EmpIdSeq;
You can use the generated value in a insert statement:
INSERT Emp (EmpId, X, Y)
VALUES (NEXT VALUE FOR EmpIdSeq, 'x', 'y');
And even use it as default for your column:
CREATE TABLE Emp
(
EmpId bigint PRIMARY KEY CLUSTERED
DEFAULT (NEXT VALUE FOR EmpIdSeq),
X nvarchar(255) NULL,
Y nvarchar(255) NULL
);
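If the application side (e.g. the C# caller) needs to reserve a whole block of ids in one round trip, sequences also support range allocation; a sketch, assuming the sequence above was created in the dbo schema:
DECLARE @first sql_variant;

-- Reserve 100 consecutive values in a single call (SQL Server 2012+)
EXEC sys.sp_sequence_get_range
     @sequence_name = N'dbo.EmpIdSeq',
     @range_size = 100,
     @range_first_value = @first OUTPUT;

SELECT CONVERT(bigint, @first) AS FirstReservedId;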
Update: The above solution is only applicable to SQL Server 2012+. For older versions you can simulate the sequence behavior using dummy tables with identity fields:
CREATE TABLE EmpIdSeq (
SeqID bigint IDENTITY PRIMARY KEY CLUSTERED
);
And a procedure that emulates NEXT VALUE:
CREATE PROCEDURE GetNewSeqVal_Emp
    @NewSeqVal bigint OUTPUT
AS
BEGIN
    SET NOCOUNT ON
    INSERT EmpIdSeq DEFAULT VALUES
    SET @NewSeqVal = SCOPE_IDENTITY()
    DELETE FROM EmpIdSeq WITH (READPAST)
END;
Usage example:
DECLARE @NewSeqVal bigint
EXEC GetNewSeqVal_Emp @NewSeqVal OUTPUT
The performance overhead of deleting the last inserted element will be minimal; still, as pointed out by the original author, you can optionally remove the DELETE statement and schedule a maintenance job to delete the table contents off-hours (trading space for performance).
Adapted from SQL Server Customer Advisory Team Blog.
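The off-hours cleanup mentioned above can be a single statement (my sketch; DELETE rather than TRUNCATE, because TRUNCATE TABLE resets the IDENTITY seed and would restart the sequence):
-- Scheduled maintenance: empty the dummy table without disturbing the counter
DELETE FROM EmpIdSeq;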
Using select max(empid) + 1 from employee is a way to get the next number, but if there are multiple users inserting into the database, context switching might cause two users to read the same value for empid, both add 1 to it, and end up with duplicate ids. If you do have multiple users, you may have to lock the table while inserting. This is not best practice, which is why auto-increment exists for database tables.
I hope this works for you. Considering that your ID field is an integer:
INSERT INTO [Table] WITH (TABLOCK)
SELECT CASE WHEN MAX(ID) IS NULL
       THEN 1 ELSE MAX(ID) + 1 END, VALUE_1, VALUE_2....
FROM [Table]
Try the following query:
INSERT INTO Table VALUES
((SELECT isnull(MAX(ID),0)+1 FROM Table), VALUE_1, VALUE_2....)
You have to wrap MAX in ISNULL; otherwise it will return NULL in the final result when the table contains no rows.
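If MAX + 1 must be used, the usual way to make it safe under concurrency (my addition; the Name column is hypothetical) is to read the maximum under UPDLOCK and HOLDLOCK in the same transaction as the insert:
BEGIN TRAN;

DECLARE @NextId int;

-- UPDLOCK + HOLDLOCK holds a key-range lock until COMMIT, so two sessions
-- cannot both read the same MAX value
SELECT @NextId = ISNULL(MAX(EmpId), 0) + 1
FROM dbo.Employee WITH (UPDLOCK, HOLDLOCK);

INSERT INTO dbo.Employee (EmpId, Name) -- Name is a hypothetical column
VALUES (@NextId, N'New employee');

COMMIT;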

How to keep a rolling checksum in SQL?

I am trying to keep a rolling checksum to account for order, so take the previous 'checksum' and xor it with the current one and generate a new checksum.
Name Checksum Rolling Checksum
------ ----------- -----------------
foo 11829231 11829231
bar 27380135 checksum(27380135 ^ 11829231) = 93291803
baz 96326587 checksum(96326587 ^ 93291803) = 67361090
How would I accomplish something like this?
(Note that the calculations are completely made up and are for illustration only)
This is basically the running total problem.
Edit:
My original claim was that this is one of the few places where a cursor-based solution actually performs best. The problem with the triangular self-join solution is that it repeatedly ends up recalculating the same cumulative checksum as a sub-calculation for the next step, so it is not very scalable: the work required grows quadratically with the number of rows.
Corina's answer uses the "quirky update" approach. I've adjusted it to do the checksum, and in my test it took 3 seconds rather than 26 seconds for the cursor solution. Both produced the same results. Unfortunately, however, it relies on an undocumented aspect of UPDATE behaviour. I would definitely read the discussion here before deciding whether to rely on this in production code.
There is a third possibility described here (using the CLR) which I didn't have time to test. From the discussion there it seems to be a good possibility for calculating running-total-type things at display time, but it is outperformed by the cursor when the result of the calculation must be saved back.
CREATE TABLE TestTable
(
    PK int identity(1,1) primary key clustered,
    [Name] varchar(50),
    [CheckSum] AS CHECKSUM([Name]),
    RollingCheckSum1 int NULL,
    RollingCheckSum2 int NULL
)

/* Insert some random records (753,571 on my machine) */
INSERT INTO TestTable ([Name])
SELECT newid() FROM sys.objects s1, sys.objects s2, sys.objects s3
Approach One: Based on the Jeff Moden Article
DECLARE @RCS int

UPDATE TestTable
SET @RCS = RollingCheckSum1 =
    CASE WHEN @RCS IS NULL THEN
        [CheckSum]
    ELSE
        CHECKSUM([CheckSum] ^ @RCS)
    END
FROM TestTable WITH (TABLOCKX)
OPTION (MAXDOP 1)
Approach Two - Using the same cursor options as Hugo Kornelis advocates in the discussion for that article.
SET NOCOUNT ON
BEGIN TRAN

DECLARE @RCS2 INT
DECLARE @PK INT, @CheckSum INT

DECLARE curRollingCheckSum CURSOR LOCAL STATIC READ_ONLY
FOR
SELECT PK, [CheckSum]
FROM TestTable
ORDER BY PK

OPEN curRollingCheckSum

FETCH NEXT FROM curRollingCheckSum
INTO @PK, @CheckSum

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @RCS2 = CASE WHEN @RCS2 IS NULL THEN @CheckSum ELSE CHECKSUM(@CheckSum ^ @RCS2) END

    UPDATE dbo.TestTable
    SET RollingCheckSum2 = @RCS2
    WHERE @PK = PK

    FETCH NEXT FROM curRollingCheckSum
    INTO @PK, @CheckSum
END

COMMIT
Test that they are the same:
SELECT * FROM TestTable
WHERE RollingCheckSum1 <> RollingCheckSum2
I'm not sure about a rolling checksum, but for a rolling sum for instance, you can do this using the UPDATE command:
declare @a table (name varchar(2), value int, rollingvalue int)

insert into @a
select 'a', 1, 0 union all select 'b', 2, 0 union all select 'c', 3, 0

select * from @a

declare @sum int
set @sum = 0

update @a
set @sum = rollingvalue = value + @sum

select * from @a
Select Name, [Checksum]
    , (Select Checksum_Agg(T1.[Checksum])
       From Table As T1
       Where T1.Name < T.Name) As RollingChecksum
From Table As T
Order By T.Name
To do a rolling anything, you need some semblance of an order to the rows. That can be by name, an integer key, a date, or whatever. In my example I used name (even though the order in your sample data isn't alphabetical). In addition, I'm using the Checksum_Agg function in SQL.
Ideally you would also have a unique value on which to compare the inner and outer query, e.g. Where T1.PK < T.PK for an integer (or even string) key. In my solution, if Name had a unique constraint, it would also work well enough.
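For completeness, a cursor-free way to express the same running XOR is a recursive CTE (a sketch reusing TestTable from the earlier answer; recursive CTEs are also slow at large row counts, so this is about expressiveness rather than speed):
WITH Ordered AS
(
    SELECT PK, [CheckSum],
           ROW_NUMBER() OVER (ORDER BY PK) AS rn
    FROM dbo.TestTable
),
Rolling AS
(
    -- Anchor: the first row seeds the rolling value with its own checksum
    SELECT PK, [CheckSum], rn, [CheckSum] AS RollingCheckSum
    FROM Ordered
    WHERE rn = 1

    UNION ALL

    -- Recursive step: fold the next row's checksum into the running value
    SELECT o.PK, o.[CheckSum], o.rn,
           CHECKSUM(o.[CheckSum] ^ r.RollingCheckSum)
    FROM Ordered AS o
    INNER JOIN Rolling AS r ON o.rn = r.rn + 1
)
SELECT PK, [CheckSum], RollingCheckSum
FROM Rolling
ORDER BY rn
OPTION (MAXRECURSION 0);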