SQL Stored Procedure - Understanding SQL Statement - sql

New to stored procedures. Can anyone explain the following SQL sample which appears at the start of a stored procedure?
Begin/End - Encloses a series of SQL statements so that a group of SQL statements can be executed
SET NOCOUNT ON - the count (indicating the number of rows affected by a SQL statement) is not returned.
DECLARE - setting local variables
While - loops round
With - unsure
Update batch - unsure
SET @Rowcount = @@ROWCOUNT; - unsure
BEGIN
    SET NOCOUNT ON;
    --UPDATE, done in batches to minimise locking
    DECLARE @Batch INT = 100;
    DECLARE @Rowcount INT = @Batch;
    WHILE @Rowcount > 0
    BEGIN
        WITH t
        AS (
            SELECT [OrganisationID],
                   [PropertyID],
                   [QuestionID],
                   [BaseAnsweredQuestionID]
            FROM dbo.Unioned_Table
            WHERE organisationid = 1),
        s
        AS (
            SELECT [OrganisationID],
                   [PropertyID],
                   [QuestionID],
                   [BaseAnsweredQuestionID]
            FROM dbo.table
            WHERE organisationid = 1),
        batch
        AS (
            SELECT TOP (@Batch) T.*,
                   s.BaseAnsweredQuestionID NewBaseAnsweredQuestionID
            FROM T
            INNER JOIN s ON t.organisationid = s.organisationid
                        AND t.PropertyID = s.PropertyID
                        AND t.QuestionID = s.QuestionID
            WHERE t.BaseAnsweredQuestionID <> s.BaseAnsweredQuestionID)
        UPDATE batch
        SET
            BaseAnsweredQuestionID = NewBaseAnsweredQuestionID;
        SET @Rowcount = @@ROWCOUNT;
    END;

The clue is in the comment --UPDATE, done in batches to minimise locking.
The intent is to update the BaseAnsweredQuestionID column of dbo.Unioned_Table (CTE t) with the equivalent column from dbo.table (CTE s), in batches of 100. The comment indicates the batching logic is there to minimise locking.
In detail:
DECLARE @Batch INT= 100; sets the batch size.
DECLARE @Rowcount INT= @Batch; initializes the loop variable to a non-zero value, so the loop body runs at least once.
WHILE @Rowcount > 0 starts the loop. @Rowcount will become zero when the update statement affects no rows (see below).
with a as () is a common table expression (commonly abbreviated to CTE) - it creates a temporary result set which you can effectively treat as a table. The next few queries define CTEs t, s and batch.
CTE batch contains at most 100 rows by using the SELECT TOP (@Batch) term. As there is no ORDER BY, it selects an arbitrary 100 of the rows where t and s match on the join columns but differ in BaseAnsweredQuestionID.
The next statement:
UPDATE batch
SET BaseAnsweredQuestionID = NewBaseAnsweredQuestionID
SET @Rowcount = @@ROWCOUNT
updates the (up to) 100 rows in the batch CTE (which in turn is a join on two other CTEs), and populates the loop variable @Rowcount with the number of rows affected by the update statement (@@ROWCOUNT). Rows that have been updated no longer satisfy the <> condition, so each pass picks up a fresh batch. If there are no matching rows left, @@ROWCOUNT becomes zero, and thus the loop ends.
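The loop described above can be tried on a small scale. The following is a minimal T-SQL sketch, not the original procedure: the temp tables #Target and #Source, their columns and their rows are all made up for illustration, and the batch size is shrunk to 2 so that the loop runs more than once.

```sql
-- Hypothetical demo tables (assumptions, not from the original procedure).
CREATE TABLE #Target (Id INT PRIMARY KEY, Val INT NOT NULL);
CREATE TABLE #Source (Id INT PRIMARY KEY, Val INT NOT NULL);

INSERT INTO #Target (Id, Val) VALUES (1, 0), (2, 0), (3, 0), (4, 0), (5, 5);
INSERT INTO #Source (Id, Val) VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 5);

DECLARE @Batch INT = 2;
DECLARE @Rowcount INT = @Batch;   -- non-zero, so the loop body runs at least once

WHILE @Rowcount > 0
BEGIN
    -- Pick at most @Batch rows whose values still differ, then update them
    -- through the CTE. Only columns of #Target are modified.
    WITH batch AS (
        SELECT TOP (@Batch) t.Val, s.Val AS NewVal
        FROM #Target AS t
        INNER JOIN #Source AS s ON s.Id = t.Id
        WHERE t.Val <> s.Val
    )
    UPDATE batch SET Val = NewVal;

    -- Must be read immediately after the UPDATE; 0 ends the loop.
    SET @Rowcount = @@ROWCOUNT;
END;

SELECT Id, Val FROM #Target ORDER BY Id;   -- now matches #Source

DROP TABLE #Target;
DROP TABLE #Source;
```

With four mismatched rows and a batch size of 2, the UPDATE affects 2, 2 and then 0 rows, at which point the loop stops.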

Related

Speed up simple update statement in postgres for 1 million rows

I have a very simple sql update statement in postgres.
UPDATE p2sa.observation SET file_path = replace(file_path, 'path/sps', 'newpath/p2s')
The observation table has 1513128 rows. The query so far has been running for around 18 hours with no end in sight.
The file_path column is not indexed, so I guess it is doing a top-to-bottom scan, but the time taken seems excessive. Probably replace is also a slow operation.
Is there some alternative or better approach for doing this one-off kind of update which affects all rows? It is essentially updating an old file path to a new location. It only needs to be updated once, or maybe again in the future.
Thanks.
In SQL Server (T-SQL) you could do a while loop to update in batches. The code below is T-SQL, so for Postgres the same idea would have to be translated.
Try this to see how it performs.
Declare @RowsEffected int
Declare @RowsCnt int
Declare @Err int
/* only rows that still contain the old path need updating */
DECLARE @MaxNumber int = (select COUNT(*) from p2sa.observation where file_path like '%path/sps%')
SELECT @RowsEffected = 0
WHILE ( @RowsEffected < @MaxNumber)
BEGIN
    /* limit each UPDATE to 10000 rows */
    SET ROWCOUNT 10000
    UPDATE p2sa.observation
    SET file_path = replace(file_path, 'path/sps', 'newpath/p2s')
    where file_path like '%path/sps%' /* skips rows that were already updated */
    SELECT @RowsCnt = @@ROWCOUNT ,@Err = @@error
    IF @Err <> 0
    BEGIN
        Print 'Problem Updating the records'
        BREAK
    END
    IF @RowsCnt = 0
        BREAK /* nothing left to update */
    SELECT @RowsEffected = @RowsEffected + @RowsCnt
    PRINT 'The total number of rows effected :'+convert(varchar(20),@RowsEffected)
    /* pausing the loop for 10 secs between batches */
    WAITFOR DELAY '00:00:10'
END
SET ROWCOUNT 0
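As a variation on the loop above, SQL Server also accepts a TOP clause directly on UPDATE, which avoids switching SET ROWCOUNT on and off. This is a T-SQL sketch only (it does not run on Postgres), and the temp table #observation with its four rows is a hypothetical stand-in for p2sa.observation.

```sql
-- Hypothetical stand-in for p2sa.observation (table and rows are assumptions).
CREATE TABLE #observation (
    id        INT IDENTITY(1,1) PRIMARY KEY,
    file_path VARCHAR(200) NOT NULL
);

INSERT INTO #observation (file_path)
VALUES ('/data/path/sps/a.dat'),
       ('/data/path/sps/b.dat'),
       ('/data/path/sps/c.dat'),
       ('/data/other/d.dat');

DECLARE @BatchSize INT = 2;   -- tiny batch so the loop runs more than once
DECLARE @Total INT = 0;
DECLARE @Rows INT = 1;        -- non-zero, so the loop body runs at least once

WHILE @Rows > 0
BEGIN
    -- The LIKE filter skips rows that were already rewritten, so every
    -- pass makes progress and the loop ends once nothing matches.
    UPDATE TOP (@BatchSize) #observation
    SET file_path = REPLACE(file_path, 'path/sps', 'newpath/p2s')
    WHERE file_path LIKE '%path/sps%';

    SET @Rows = @@ROWCOUNT;
    SET @Total = @Total + @Rows;
END;

SELECT @Total AS rows_updated;
SELECT id, file_path FROM #observation ORDER BY id;

DROP TABLE #observation;
```

Three of the four sample rows contain the old path, so the batches affect 2, 1 and 0 rows and the fourth row is left untouched.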

Is this sql update guaranteed to be atomic?

I have the following sql:
UPDATE Customer SET Count=1 WHERE ID=1 AND Count=0
SELECT @@ROWCOUNT
I need to know if this is guaranteed to be atomic.
If 2 users try this simultaneously, will only one succeed and get a return value of 1? Do I need to use a transaction or something else in order to guarantee this?
The goal is to get a unique 'Count' for the customer. Collisions in this system will almost never happen, so I am not concerned with the performance if a user has to query again (and again) to get a unique Count.
EDIT:
The goal is to not use a transaction if it is not needed. Also, this logic is run very infrequently (up to 100 times per day), so I wanted to keep it as simple as possible.
It may depend on the database server you are using. However, for most, the answer is yes. I guess you are implementing a lock.
Using SQL Server (v 11.0.6020), this is indeed an atomic operation, as best as I can determine.
I wrote some test stored procedures to try to test this logic:
-- Attempt to update a Customer row with a new Count, returns
-- The current count (used as customer order number) and a bit
-- which determines success or failure. If @Success is 0, re-run
-- the query and try again.
CREATE PROCEDURE [dbo].[sp_TestUpdate]
(
    @Count INT OUTPUT,
    @Success BIT OUTPUT
)
AS
BEGIN
    DECLARE @NextCount INT
    SELECT @Count=Count FROM Customer WHERE ID=1
    SET @NextCount = @Count + 1
    UPDATE Customer SET Count=@NextCount WHERE ID=1 AND Count=@Count
    SET @Success=@@ROWCOUNT
END
And:
-- Loop (many times) trying to get a number and insert it into another
-- table. Execute this loop concurrently in several different windows
-- using SSMS.
CREATE PROCEDURE [dbo].[sp_TestLoop]
AS
BEGIN
    DECLARE @Iterations INT
    DECLARE @Counter INT
    DECLARE @Count INT
    DECLARE @Success BIT
    SET @Iterations = 40000
    SET @Counter = 0
    WHILE (@Counter < @Iterations)
    BEGIN
        SET @Counter = @Counter + 1
        EXEC sp_TestUpdate @Count = @Count OUTPUT , @Success = @Success OUTPUT
        IF (@Success=1)
        BEGIN
            INSERT INTO TestImage (ImageNumber) VALUES (@Count)
        END
    END
END
This code ran, creating unique sequential ImageNumber values in the TestImage table. This supports the conclusion that the above SQL update call is indeed atomic. The procedures did not guarantee that every update attempt succeeded, but no duplicates were created and no numbers were skipped.
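The "claim it only if it is still unclaimed" pattern from the question can be seen in isolation with a small sketch. The temp table #Customer below is a hypothetical stand-in for Customer, and the two attempts run one after the other in a single session, so this only shows how @@ROWCOUNT reports the winner and the loser, not a concurrency test.

```sql
-- Hypothetical stand-in for the Customer table (an assumption for illustration).
CREATE TABLE #Customer (ID INT PRIMARY KEY, [Count] INT NOT NULL);
INSERT INTO #Customer (ID, [Count]) VALUES (1, 0);

DECLARE @FirstAttempt INT;
DECLARE @SecondAttempt INT;

-- First caller: the row still has Count = 0, so exactly one row is updated.
UPDATE #Customer SET [Count] = 1 WHERE ID = 1 AND [Count] = 0;
SET @FirstAttempt = @@ROWCOUNT;

-- Second caller: the WHERE clause no longer matches, so no row is updated.
UPDATE #Customer SET [Count] = 1 WHERE ID = 1 AND [Count] = 0;
SET @SecondAttempt = @@ROWCOUNT;

SELECT @FirstAttempt AS first_attempt, @SecondAttempt AS second_attempt;

DROP TABLE #Customer;
```

The first attempt reports 1 and the second reports 0, which is the signal a caller would use to decide whether to retry.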

About lock behavior when update row in sql server

I'm trying to generate a sequentially incrementing number in SQL Server, starting from a number provided by users.
I have a problem when multiple users insert a row at the same time with the same number.
I tried storing the number that the user provided in a temporary table, and I expected that when I update the same table with the same condition, SQL Server would block any modification to this row until the current update finished, but it does not.
Here is the update statement I used:
UPDATE GlobalParam
SET ValueString = (CAST(ValueString as bigint) + 1)
WHERE Id = 'xxxx'
Could you tell me a way to force the other update command to wait until the current command has finished?
This is my entire command:
DECLARE @Result bigint;
UPDATE GlobalParam SET ValueString = (SELECT MAX(Code) FROM Item)
DECLARE @SelectTopStm nvarchar(MAX);
DECLARE @ExistRow int
SET @SelectTopStm = 'SELECT @ExistRow = 1 FROM (SELECT TOP 1 Code FROM Item WHERE Code = ''999'') temp'
EXEC sp_executesql @SelectTopStm, N'@ExistRow int output', @ExistRow output
IF (@ExistRow is not null)
BEGIN
    DECLARE @MaxValue bigint
    DECLARE @ReturnUpdateTbl table (ValueString nvarchar(max));
    UPDATE GlobalParam SET ValueString = (CAST(ValueString as bigint) + 1)
    OUTPUT inserted.ValueString INTO @ReturnUpdateTbl
    WHERE [Id] = '333A8E1F-16DD-E411-8280-D4BED9D726B3'
    SELECT TOP 1 @MaxValue = CAST(ValueString as bigint) FROM @ReturnUpdateTbl
    SET @Result = @MaxValue
END
ELSE
BEGIN
    SET @Output = 999
END
END
I wrote the code above as a stored procedure.
Here is the real code when I insert one Item:
DECLARE @IncrementResult BIGINT
EXEC IncrementNumberUnique
, (some parameters)..
,@Result = @IncrementResult OUTPUT
INSERT INTO ITEM (Id, Code) VALUES ('xxxx', @IncrementResult)
I created 3 threads and ran them at the same time.
The return result :
Id Code
1 999
2 1000
3 1000
Thanks
If I understood your requirements, try the ROWLOCK hint to tell the optimizer to start by locking the rows one by one as the update needs them.
UPDATE GlobalParam WITH(ROWLOCK)
SET ValueString = (CAST(ValueString as bigint) + 1)
WHERE Id = 'xxxx'
By default SQL Server uses the READ COMMITTED isolation level, which releases read (shared) locks as soon as the read operation completes. Once the UPDATE statement below is complete, all read locks are released from the table Item.
UPDATE GlobalParam SET ValueString = (SELECT MAX(Code) FROM Item)
Since your INSERT into Item is outside the scope of your procedure, you can run the thread in the SERIALIZABLE isolation level. Something like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
DECLARE @IncrementResult BIGINT
EXEC IncrementNumberUnique
, (some parameters)..
,@Result = @IncrementResult OUTPUT
INSERT INTO ITEM (Id, Code) VALUES ('xxxx', @IncrementResult)
Changing the isolation level to SERIALIZABLE will increase blocking and contention for resources on the Item table.
To know more about isolation level, refer this
You should look into identity columns and remove such manual computation of incremental columns if possible.
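As a rough illustration of the identity-column suggestion, here is a minimal T-SQL sketch. The temp table #Item, its Name column and the seed of 999 (chosen only to echo the starting number in the question) are assumptions for illustration; the real Item table will differ.

```sql
-- Hypothetical table: SQL Server assigns Code itself, starting at 999.
CREATE TABLE #Item (
    Code BIGINT IDENTITY(999, 1) PRIMARY KEY,
    Name NVARCHAR(50) NOT NULL
);

DECLARE @FirstCode BIGINT;
DECLARE @SecondCode BIGINT;

-- No manual MAX(Code) + 1: each insert receives its own number, even when
-- several sessions insert at the same time.
INSERT INTO #Item (Name) VALUES (N'first item');
SET @FirstCode = SCOPE_IDENTITY();

INSERT INTO #Item (Name) VALUES (N'second item');
SET @SecondCode = SCOPE_IDENTITY();

SELECT @FirstCode AS first_code, @SecondCode AS second_code;
SELECT Code, Name FROM #Item ORDER BY Code;

DROP TABLE #Item;
```

One trade-off to be aware of: identity values are unique, but they can leave gaps, for example when an insert is rolled back, so this fits only if gap-free numbering is not a requirement.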

how to set rowcount to zero in SQL

Does anyone have an idea how to set the rowcount to zero again in SQL?
I have used rowcount to fetch the number of records inserted by the insert statement. I have to use rowcount again to find the number of rows updated, so I am trying to reset the rowcount to zero.
The code will look like this:
INSERT INTO Table A ......
INSERT INTO statistics (id,inserted_records ) values (1,@@rowcount)
---some operations--
Update Table A ....
Update statistics set updated_records=@@rowcount where id=
select @@rowcount
only returns the row count from the most recent statement. It doesn't need to be reset. Executing another statement will automatically reset it.
If for some bizarre reason, you want to make @@rowcount return 0, execute a query that will return 0 rows.
select 1 where 2=3
You can prove this like so.
declare @t table (i int)
declare @stats table(rc int)
insert @t values (1),(2),(3)
-- rowcount is 3
insert @stats values (@@rowcount)
-- rowcount is 1
update @t set i=5 where i=4
select @@ROWCOUNT -- rowcount will be 0
@bibinmatthew @@ROWCOUNT automatically resets whenever you execute another statement.
What I would consider doing is
declare @rowsAffected int
insert into table A ...
select @rowsAffected = @@ROWCOUNT
insert into table statistics (id, inserted_records) values (iID, @rowsAffected)
declare @rowsAffected int
update table A ...
select @rowsAffected = @@ROWCOUNT
update table statistics set updated_records = @rowsAffected where id = iID
This way, you do not have to deal directly with the @@ROWCOUNT variable after the fact. I have declared @rowsAffected twice because I am assuming that you have the insert and the update scripts in different stored procedures.
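A runnable version of that idea might look like the sketch below. The table variables @A and @statistics, their columns and the sample values are made up for illustration, and both statements live in one batch, so @rowsAffected is declared only once.

```sql
-- Hypothetical tables (assumptions for illustration).
DECLARE @A TABLE (id INT PRIMARY KEY, val INT NOT NULL);
DECLARE @statistics TABLE (
    id               INT PRIMARY KEY,
    inserted_records INT NULL,
    updated_records  INT NULL
);
DECLARE @rowsAffected INT;

INSERT INTO @A (id, val) VALUES (1, 10), (2, 20), (3, 30);
SET @rowsAffected = @@ROWCOUNT;   -- capture right away: 3 rows inserted
INSERT INTO @statistics (id, inserted_records) VALUES (1, @rowsAffected);

UPDATE @A SET val = val + 1 WHERE id IN (1, 2);
SET @rowsAffected = @@ROWCOUNT;   -- capture right away: 2 rows updated
UPDATE @statistics SET updated_records = @rowsAffected WHERE id = 1;

SELECT id, inserted_records, updated_records FROM @statistics;
```

The final SELECT shows 3 inserted and 2 updated records. No reset is needed between the two steps, because each statement overwrites @@ROWCOUNT.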
Information about rowcount is here:
http://technet.microsoft.com/en-us//library/ms187316.aspx
Statements such as USE, SET <option>, DEALLOCATE CURSOR, CLOSE CURSOR,
BEGIN TRANSACTION or COMMIT TRANSACTION reset the ROWCOUNT value to 0
So maybe you can put the statements into a transaction, so they are better isolated against each other.
And you can always just send a request that does not update anything, like UPDATE mytable SET x=1 WHERE 0=1;

I need to run a stored procedure on multiple records

I need to run a stored procedure on a bunch of records. The code I have now iterates through the records stored in a temp table. The stored procedure returns a table of records.
I was wondering what, if anything, I can do to avoid the iteration.
set @counter = 1
set @empnum = null
set @lname = null
set @fname = null
-- get all punches for employees
while exists(select emp_num, lname, fname from #tt_employees where id = @counter)
begin
    set @empnum = 0
    select @empnum = emp_num, @lname = lname , @fname= fname from #tt_employees where id = @counter
    INSERT #tt_hrs
    exec PCT_GetEmpTimeSp
        @empnum
        ,@d_start_dt
        ,@d_end_dt
        ,@pMode = 0
        ,@pLunchMode = 3
        ,@pShowdetail = 0
        ,@pGetAll = 1
    set @counter = @counter + 1
end
One way to avoid this kind of iteration is to analyze the code within the stored procedure and revise it so that, rather than processing one set of inputs at a time, it processes all sets of inputs at once. Often enough, this is not possible, which is why iteration loops are not all that uncommon.
A possible alternative is to use APPLY functionality (cross apply, outer apply). To do this, you'd rewrite the procedure as a table-valued function, and work that function into the query something like so:
INSERT #tt_hrs
select [columnList]
from #tt_employees
cross apply dbo.PCT_GetEmpTimeFunc(emp_num, @d_start_dt, @d_end_dt, 0, 3, 0, 1)
(It was not clear where all your inputs to the procedure were coming from.)
Note that you still are iterating over calls to the function, but now it's "packed" into one query.
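To show the shape of such a query without creating a function, the sketch below applies an inline subquery per employee row. Everything in it is hypothetical: the table variables @employees and @punches, their columns and rows, and the subquery itself, which merely stands in for whatever PCT_GetEmpTimeSp actually computes.

```sql
-- Hypothetical data (assumptions for illustration).
DECLARE @employees TABLE (emp_num INT PRIMARY KEY, lname VARCHAR(50) NOT NULL);
DECLARE @punches TABLE (
    emp_num      INT NOT NULL,
    work_date    DATE NOT NULL,
    hours_worked DECIMAL(5,2) NOT NULL
);

INSERT INTO @employees (emp_num, lname) VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO @punches (emp_num, work_date, hours_worked)
VALUES (1, '2024-01-02', 8.0),
       (1, '2024-01-03', 7.5),
       (2, '2024-01-02', 6.0),
       (2, '2024-02-01', 8.0);   -- outside the date range used below

DECLARE @d_start_dt DATE = '2024-01-01';
DECLARE @d_end_dt   DATE = '2024-01-31';

-- The APPLY operand is evaluated once per employee row, much like calling
-- a table-valued function with that row's emp_num.
SELECT e.emp_num, e.lname, h.total_hours
FROM @employees AS e
CROSS APPLY (
    SELECT SUM(p.hours_worked) AS total_hours
    FROM @punches AS p
    WHERE p.emp_num = e.emp_num
      AND p.work_date BETWEEN @d_start_dt AND @d_end_dt
) AS h
ORDER BY e.emp_num;
```

Because the subquery is an aggregate without GROUP BY, it always yields one row, so every employee appears in the result; an employee with no punches in range would simply get a NULL total.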
I think you are on the right track.
You can have a temp table with an identity column:
CREATE TABLE #A (ID INT IDENTITY(1,1) NOT NULL, Name VARCHAR(50))
After records are inserted in to this temp table, find the total number of records in the table.
DECLARE @TableLength INTEGER
SELECT @TableLength = MAX(ID) FROM #A
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= @TableLength)
BEGIN
    -- DO your work here
    SET @Index = @Index + 1
END
Similar to what you have already proposed.
An alternative way to iterate over records is to use a CURSOR, but cursors should be avoided wherever possible.