Can an update with a nested select be considered atomic in Sybase?

I am trying something like this:
set rowcount 10 -- fetch only 10 rows
Update tableX set x=@BatchId where id in (select id from tableX where x=0)
Basically, this marks 10 records as booked by supplying a batch ID.
So my question is: if this proc is executed in parallel, can I guarantee that the update with the nested select is atomic, and that no two invocations will select the same set of records from tableX for booking?
Thanks

To guarantee that no such overlaps occur, you should:
(i) put BEGIN TRANSACTION - COMMIT around the statement
(ii) put the HOLDLOCK keyword directly behind 'tableX' (or run the whole statement at isolation level 3).

Related

SQL Concurrent transactions with SELECT

I have a transaction which looks like this:
BEGIN;
SELECT last_table_row;
...some long queries with last_table_row data...
DELETE selected_last_table_row;
COMMIT;
If I run this transaction twice at the same time, the second one uses the same last_table_row item as the first (tried from the psql console), which is not desired. How should I handle this type of transaction so that it is safe and concurrent transactions do not interfere?
Use SELECT FOR UPDATE.
If FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE or FOR KEY SHARE is specified, the SELECT statement locks the selected rows against concurrent updates.
Here is the documentation on The Locking Clause.
As an alternative to SELECT FOR UPDATE: since you remove that row at the end of the transaction anyway, you may as well remove it at the beginning and copy it into a temporary table:
BEGIN;
CREATE TEMP TABLE last_table_row ON COMMIT DROP AS
WITH ltr AS (
DELETE FROM yourtable
WHERE id = (SELECT MAX(id) FROM yourtable)
RETURNING *)
SELECT * FROM ltr;
...some long queries with last_table_row data...
COMMIT;

How to guarantee only one process picks up a processing task

I have multiple computers that have the task of sending out emails found in a table on a common SQL Server. Each computer polls the email table to look for messages it can send by looking at a status flag set to 0. If a computer does a
SELECT * FROM tblEmailQueue where StatusFlag=0
and it returns a record, it immediately sets the StatusFlag to 1, which should cause the other computers polling the same table not to find this record. My fear is that if two computers find the record at the same time, before either can update the StatusFlag, the email will be sent twice. Does anyone have ideas on how to ensure only one computer will get the record? I know I might be able to use a table lock, but I would rather not have to do this.
Instead of using two queries which may cause a race condition, you can update the values and output the updated rows at once using the OUTPUT clause.
This updates the rows with statusflag=0 and outputs all of the updated ones:
UPDATE tblEmailQueue
SET statusflag=1
OUTPUT DELETED.*
WHERE statusflag=0;
EDIT: If you're picking one row, you may want some ordering. Since the update itself can't order, you can use a common table expression to do the update:
WITH cte AS (
SELECT TOP 1 id, statusflag FROM tblEmailQueue
WHERE statusflag = 0 ORDER BY id
)
UPDATE cte SET statusflag=1 OUTPUT DELETED.*;
You can perform the select and send the email in the same transaction. You can also use the ROWLOCK hint and not commit the transaction until you have sent the email or set a new value for StatusFlag. This means that nobody (except a transaction using the NOLOCK hint or the READ UNCOMMITTED isolation level) can read this row until you commit the transaction.
SELECT * FROM tblEmailQueue WITH(ROWLOCK) where StatusFlag=0
In addition, you should check the isolation level. For your case the isolation level should be READ COMMITTED or REPEATABLE READ.
Add another column to your table tblEmailQueue (say UserID), then try to pull one email such as
-- Flag an email and assign it to the application that made the request.
-- @CurrentUserID is an ID unique to each application or user, supplied by the application
-- or user that made the request. This also ensures that the record goes to
-- the right application, and you can use it for other purposes such as monitoring.
UPDATE tblEmailQueue SET UserID = @CurrentUserID, StatusFlag = 1
WHERE ID = isnull(
  (select top 1 ID from tblEmailQueue where StatusFlag = 0 order by ID),
  0)
-- Get the email that was flagged for the current user ID
SELECT * FROM tblEmailQueue where StatusFlag=1 and UserID = @CurrentUserID
Here in Indianapolis, we are familiar with race conditions ;-)
Let's assume you actually have an ID field and a StatusFlag field, and create a stored proc that includes
declare @id int
select top 1 @id = id from tblEmailQueue where StatusFlag=0
if @@rowcount = 1
begin
  update tblEmailQueue set StatusFlag=1 where ID = @id AND StatusFlag=0
  if @@rowcount = 1
  begin
    -- I won the race, continue processing
    ...
  end
end
ADDED
Explicit handling like this is inferior to Joachim's method if all you want is the result of the select. But this method also works with old versions of SQL Server as well as with other databases.
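To illustrate the "other databases" point, here is a minimal sketch of the same guarded update in Python with the standard sqlite3 module. SQLite is only a stand-in, the helper names find_candidate and try_claim are hypothetical, and the two "workers" are simulated on one connection: both read the same candidate before either updates, and the re-check of StatusFlag in the WHERE clause lets only one of them win.

```python
import sqlite3

def find_candidate(conn):
    """Return the id of the oldest unsent email, or None if the queue is empty."""
    row = conn.execute(
        "SELECT id FROM tblEmailQueue WHERE StatusFlag = 0 ORDER BY id LIMIT 1"
    ).fetchone()
    return row[0] if row else None

def try_claim(conn, email_id):
    """Flag the row only if it is still unclaimed; True means this caller won."""
    cur = conn.execute(
        "UPDATE tblEmailQueue SET StatusFlag = 1 WHERE id = ? AND StatusFlag = 0",
        (email_id,),
    )
    conn.commit()
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmailQueue ("
    "id INTEGER PRIMARY KEY, StatusFlag INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO tblEmailQueue (id) VALUES (?)", [(1,), (2,)])
conn.commit()

# Simulate the race: both workers look before either one updates.
seen_by_a = find_candidate(conn)
seen_by_b = find_candidate(conn)
a_won = try_claim(conn, seen_by_a)
b_won = try_claim(conn, seen_by_b)
```

The worker that loses simply sees an affected-row count of 0 and should look for another candidate instead of sending the email.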

Do databases always lock non-existent rows after a query or update?

Given:
customer[id BIGINT AUTO_INCREMENT PRIMARY KEY, email VARCHAR(30), count INT]
I'd like to execute the following atomically: Update the customer if he already exists; otherwise, insert a new customer.
In theory this sounds like a perfect fit for SQL-MERGE but the database I am using doesn't support MERGE with AUTO_INCREMENT columns.
https://stackoverflow.com/a/1727788/14731 seems to indicate that if you execute a query or update statement against a non-existent row, the database will lock the index thereby preventing concurrent inserts.
Is this behavior guaranteed by the SQL standard? Are there any databases that do not behave this way?
UPDATE: Sorry, I should have mentioned this earlier: the solution must use READ_COMMITTED transaction isolation unless that is impossible in which case I will accept the use of SERIALIZABLE.
This question is asked about once a week on SO, and the answers are almost invariably wrong.
Here's the right one.
insert customer (email, count)
select 'foo@example.com', 0
where not exists (
select 1 from customer
where email = 'foo@example.com'
)
update customer set count = count + 1
where email = 'foo@example.com'
If you like, you can insert a count of 1 and skip the update if the inserted rowcount -- however expressed in your DBMS -- returns 1.
The above syntax is absolutely standard and makes no assumption about locking mechanisms or isolation levels. If it doesn't work, your DBMS is broken.
Many people are under the mistaken impression that the select executes "first" and thus introduces a race condition. No: that select is part of the insert. The insert is atomic. There is no race.
Use Russell Fox's code but use SERIALIZABLE isolation. This will take a range lock so that the non-existing row is logically locked (together with all other non-existing rows in the surrounding key range).
So it looks like this:
BEGIN TRAN
IF EXISTS (SELECT 1 FROM foo WITH (UPDLOCK, HOLDLOCK) WHERE [email] = 'thisemail')
BEGIN
UPDATE foo...
END
ELSE
BEGIN
INSERT INTO foo...
END
COMMIT
Most of the code is taken from his answer, but fixed to provide mutual exclusion semantics.
Answering my own question since there seems to be a lot of confusion around the topic. It seems that:
-- BAD! DO NOT DO THIS! --
insert customer (email, count)
select 'foo@example.com', 0
where not exists (
select 1 from customer
where email = 'foo@example.com'
)
is open to race conditions (see Only inserting a row if it's not already there). From what I've been able to gather, the only portable solution to this problem is:
1. Pick a key to merge against. This could be the primary key, or another unique key, but it must have a unique constraint.
2. Try to insert a new row. You must catch the error that will occur if the row already exists.
3. The hard part is over. At this point, the row is guaranteed to exist and you are protected from race conditions by the fact that you are holding a write lock on it (due to the insert from the previous step).
4. Go ahead and update if needed, or select its primary key.
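The steps above can be sketched in Python with the standard sqlite3 module. This is only an illustration under assumptions: SQLite stands in for the real database, upsert_customer is a hypothetical helper, and the unique constraint on email is the merge key. Note that some databases (PostgreSQL, for example) abort the whole transaction on a constraint error, so there the insert would need a savepoint around it.

```python
import sqlite3

def upsert_customer(conn, email):
    """Insert the customer if missing, then bump the counter, in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        try:
            conn.execute(
                'INSERT INTO customer (email, "count") VALUES (?, 0)', (email,)
            )
        except sqlite3.IntegrityError:
            pass  # the unique constraint fired: the row already exists
        # Either way the row exists now, so the update always finds it.
        conn.execute(
            'UPDATE customer SET "count" = "count" + 1 WHERE email = ?', (email,)
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "email TEXT NOT NULL UNIQUE, "
    '"count" INTEGER NOT NULL)'
)

upsert_customer(conn, "foo@example.com")
upsert_customer(conn, "foo@example.com")
upsert_customer(conn, "bar@example.com")
counts = dict(conn.execute('SELECT email, "count" FROM customer'))
```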
IF EXISTS (SELECT 1 FROM foo WHERE [email] = 'thisemail')
BEGIN
UPDATE foo...
END
ELSE
BEGIN
INSERT INTO foo...
END

Is "select count(*) from..." more reliable than @@ROWCOUNT?

I have a proc that inserts records to a temp table. In pseudocode it looks like this:
Create temp table
a. Insert rows into temp table based on stringent criteria
b. if no rows were inserted, insert based on less stringent criteria
c. if there are still no rows, try again with even less stringent criteria
select from temp table
There are a lot of IF @@rowcount = 0 checks in the code to see if the table has any rows. The issue is that the proc isn't really doing what it looks like it should be doing and it's inserting the same row twice (steps a and c are being executed). However, if I change it to check this way IF ( (select count(*) from #temp) = 0) the proc works exactly as expected.
Which makes me think that @@rowcount isn't the best solution to this problem. But I'm adding in extra work via the SELECT COUNT(*). Which is the better solution?
@@rowcount is the better solution. The work is already done. Selecting count(*) causes the database to do more work.
You need to make sure you are not doing something that will affect the value of @@rowcount before checking the value of @@rowcount. It is usually best to check @@rowcount immediately after performing the insert statement. If necessary assign the value to a variable so you can check it later:
DECLARE @rows int
...
[insert or update a table]
SET @rows = @@rowcount
Storing the row count immediately after any operation that changes row count will allow you to use the value more than once.
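The same habit can be shown outside T-SQL. Below is a minimal sketch in Python with the standard sqlite3 module, where Cursor.rowcount plays the role of @@rowcount; the tables, the score column and the thresholds are hypothetical stand-ins for the "stringent criteria" in the question. The count is captured straight after each insert and only that saved value decides whether to fall back to looser criteria.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (id INTEGER PRIMARY KEY, score INTEGER);
CREATE TABLE temp_result (id INTEGER PRIMARY KEY, score INTEGER);
INSERT INTO source VALUES (1, 40), (2, 65);
""")

thresholds = [90, 60, 30]  # from most to least stringent
used = None
inserted = 0
for minimum in thresholds:
    cur = conn.execute(
        "INSERT INTO temp_result SELECT id, score FROM source WHERE score >= ?",
        (minimum,),
    )
    inserted = cur.rowcount  # capture right away, before any other statement
    if inserted > 0:
        used = minimum
        break  # rows were found, do not fall through to looser criteria

rows = conn.execute("SELECT id, score FROM temp_result").fetchall()
```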
@@ROWCOUNT only reports the "affected rows" of the previous statement. That can be broader than you expect, because many statements create or "affect" rows, and some do not return a value to the client.
Microsoft says:
Statements that make an assignment in a query or use RETURN in a query set the @@ROWCOUNT value to the number of rows affected or read by the query, for example: SELECT @local_variable = c1 FROM t1.
Data manipulation language (DML) statements set the @@ROWCOUNT value to the number of rows affected by the query and return that value to the client. The DML statements may not send any rows to the client.
DECLARE CURSOR and FETCH set the @@ROWCOUNT value to 1.
EXECUTE statements preserve the previous @@ROWCOUNT.
Statements such as USE, SET, DEALLOCATE CURSOR, CLOSE CURSOR, BEGIN TRANSACTION or COMMIT TRANSACTION reset the ROWCOUNT value to 0.
If you are not getting what you expect from @@ROWCOUNT (which probably means your query is more complex than your example), I would definitely look at using the SELECT COUNT(*) option or, if you're worried about the performance hit, do something like this:
INSERT INTO temptable
(cols...)
SELECT
COL = @VAL
FROM
sourcetableorquery
LEFT JOIN temptable on [check for existing row]
WHERE
temptable.id is null
This will be faster than the count option if you are looping over a big recordset.
If it's ever being used twice in a row, that could be a problem:
http://beyondrelational.com/blogs/madhivanan/archive/2010/08/09/proper-usage-of-rowcount.aspx

SQL Server 2005 and SELECT and UPDATE locked

I want to perform an update and then select the result. I don't want anything to be able to update the row I am updating until after the select has occurred. How would I do this?
My goal is to increment a value in a row and return the incremented value. So far I have found that when an update (to increment) followed by a select runs in two queries at nearly the same time, both selects seem to return the same number. So I am guessing that something like update > update > select > select is happening.
I mislabeled this as SQL Server 2005. I am actually working with SQL Server 2000, so the OUTPUT clause does not work (it is not in that version).
BEGIN TRANSACTION
UPDATE Table SET Last=(Last+1) WHERE ID=someid;
SELECT * FROM Table WHERE ID=someid;
COMMIT TRANSACTION
BEGIN TRAN
UPDATE ...
SELECT...
COMMIT
Should do it even at the default transaction isolation level of read committed.
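As a rough illustration of the update-then-select pattern inside one transaction, here is a sketch in Python with the standard sqlite3 module. SQLite is only a stand-in for SQL Server, BEGIN IMMEDIATE is SQLite's way of taking the write lock up front, and the counters table, the last_no column and next_value are hypothetical names.

```python
import sqlite3

def next_value(conn, row_id):
    """Increment the counter and read it back before anyone else can change it."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock for the whole unit
    try:
        conn.execute(
            "UPDATE counters SET last_no = last_no + 1 WHERE id = ?", (row_id,)
        )
        value = conn.execute(
            "SELECT last_no FROM counters WHERE id = ?", (row_id,)
        ).fetchone()[0]
        conn.execute("COMMIT")
        return value
    except Exception:
        conn.execute("ROLLBACK")
        raise

# isolation_level=None puts sqlite3 in autocommit mode so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, last_no INTEGER NOT NULL)")
conn.execute("INSERT INTO counters (id, last_no) VALUES (1, 0)")

values = [next_value(conn, 1) for _ in range(3)]
```

Because the select runs before the commit, it reads the value written by its own update; a second caller has to wait for the lock and then sees the next number.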
You could also use the OUTPUT clause to get the row back directly after the update (available from SQL Server 2005 onward). For example:
UPDATE <YourTable>
SET ...
OUTPUT INSERTED.*
WHERE ...