I have a big dataset in Oracle and need to process the rows whose PROCESSED column is 0.
I have multiple instances of an application, each of which reads one row at a time and processes it. To prevent multiple threads from accessing the same row, I am using
SELECT * FROM FOO WHERE ROWNUM = 1 FOR UPDATE
If I execute the above query, the first thread locks the row and the other threads cannot fetch any rows, because the ROWNUM = 1 row is already locked by the first thread. What I am trying to achieve is to fetch the "next unlocked" row.
Is there an efficient way to do it via SQL?
Looks like SKIP LOCKED is what you are looking for.
See documentation
select * from foo for update skip locked
will select only those rows which are not locked by other transactions
Related
I have an application that selects rows with a particular status and then starts processing them. However, long-running processing can cause a new instance of my program to be started, which selects the same rows again because the first instance hasn't had time to update the status yet. So I'm thinking of selecting my rows and updating the status to something else at the same time, so they cannot be selected again. I have done some searching and got the impression that the following should work, but it fails.
UPDATE table SET status = 5 WHERE status in
(SELECT TOP (10) * FROM table WHERE status = 1)
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
TLDR: Is it possible to both select and update rows at the same time? (The order doesn't really matter)
You can use an output clause to update the rows and then return them as if they were selected.
As for updating the top 100 rows only, a safe approach is to use a cte to select the relevant rows first (I assumed that column id can be used to order the rows):
with cte as (select top (100) * from mytable where status = 1 order by id)
update cte set status = 5 output inserted.*
You can go straight to an UPDATE statement. It takes an exclusive lock on each row it modifies, and other concurrent transactions cannot read that row while the lock is held, unless they read uncommitted data.
More information on locks
Exclusive locks: Exclusive locks are used to lock data being modified by one transaction thus preventing modifications by other
concurrent transactions. You can read data held by exclusive lock only
by specifying a NOLOCK hint or using a read uncommitted isolation
level. Because DML statements first need to read the data they want to
modify you'll always find Exclusive locks accompanied by shared locks
on that same data.
UPDATE TOP(10) table
SET status = 5 WHERE status =1
I have a table that contains order numbers and order states (IN_PROGRESS, CANCELED, READY_FOR_PROC). I need to write a query that returns any single row in the state READY_FOR_PROC. The problem is that this query will be executed by multiple threads, and each thread should get a record that has not yet been picked up by another thread (no duplicates). I tried to do this with SELECT FOR UPDATE SKIP LOCKED and rownum=1, but then every query except one returns empty (if the first thread keeps the record locked for a long time). How do I write such a query?
I use Oracle if it's important
Change your skip locked slightly to do this:
cursor C is SELECT ... FROM ... FOR UPDATE SKIP LOCKED
then in your code, you just fetch
FETCH C INTO ...
It is the rownum=1 that is causing your issue.
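A toy Python model of why the order matters (this is not Oracle code). It assumes, as the answer implies, that the ROWNUM = 1 filter picks its row first and only then are locked rows skipped. The function names and the `locked` set are made up for illustration.

```python
# Toy in-memory model, NOT Oracle: rows are dicts and `locked` is the set of
# ids currently locked by other sessions. All names here are hypothetical.

def rownum_1_then_skip_locked(rows, locked):
    """Models: SELECT ... WHERE ROWNUM = 1 FOR UPDATE SKIP LOCKED."""
    candidates = rows[:1]  # the ROWNUM = 1 filter picks the first row only
    # Locked rows are skipped afterwards, so the result may be empty.
    return [r for r in candidates if r["id"] not in locked]

def skip_locked_then_fetch_one(rows, locked):
    """Models: open a FOR UPDATE SKIP LOCKED cursor and fetch a single row."""
    for r in rows:
        if r["id"] not in locked:
            return [r]  # first row that nobody else holds
    return []

rows = [{"id": 1}, {"id": 2}, {"id": 3}]
locked = {1}  # another session already holds row 1

with_rownum = rownum_1_then_skip_locked(rows, locked)
with_fetch = skip_locked_then_fetch_one(rows, locked)
print(with_rownum, with_fetch)
```

In this model the ROWNUM variant comes back empty as soon as the first row is locked, while fetching one row from the cursor moves on to row 2.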
I have a parallel process that is using queue table in PostgreSQL. Logic is:
1. Begin transaction.
2. Mark 100 unprocessed records with some random generated ID.
3. Commit.
4. Run some heavy app logic that takes some time and is processing queue records with generated ID in step 2.
5. Update 100 processed records with success/bad status.
Up to 20 threads are doing those steps.
However, sometimes when I run step 2 with this query:
UPDATE QUEUE_TABLE
SET QUEUE_TXN_GUID=$RANDOM_GUID,
QUEUE_STATUS=1
WHERE QUEUE_ROW_GUID IN
(SELECT QUEUE_ROW_GUID from QUEUE_TABLE
WHERE QUEUE_STATUS IS NULL OR QUEUE_STATUS = -1
LIMIT 100 FOR UPDATE SKIP LOCKED) RETURNING QUEUE_ROW_GUID
I get the error "deadlock detected".
The query I use in step 5 is
UPDATE QUEUE_TABLE SET CDC_QUEUE_REZ_STATUS=$STATUS WHERE CDC_QUEUE_REZ_TXN_GUID=$RANDOM_GUID;
I don't understand why I get this deadlock when the subquery of the first update uses FOR UPDATE SKIP LOCKED.
The cause of the issue is that there are duplicates in QUEUE_ROW_GUID. The SELECT locks some rows, but the UPDATE then modifies every row with a matching QUEUE_ROW_GUID, not just the rows that were locked. A concurrently running query may therefore try to update the same rows as this one, so SKIP LOCKED does not protect you in this case.
Since the rows may be updated in a different order by each query, the first query (which tries to update, say, row 1 and row 2) may update row 1 and then wait for the lock on row 2, while a concurrent query (which also tries to update rows 1 and 2) has already updated row 2 and is waiting for the lock on row 1. Hence the deadlock.
You need to use unique identifiers to update the rows after they are locked.
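A small sketch of the duplicate effect, using Python's sqlite3 as a stand-in (SQLite has no FOR UPDATE SKIP LOCKED, so only the row-matching part is shown). The `id` column and the sample GUID values are assumptions added for the demo.

```python
import sqlite3

# Stand-in demo with SQLite, not PostgreSQL: it only shows how many rows the
# UPDATE touches when the IN list is built from a non-unique column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue_table ("
    " id INTEGER PRIMARY KEY,"  # hypothetical unique key
    " queue_row_guid TEXT,"     # contains duplicates
    " queue_status INTEGER)"
)
conn.executemany(
    "INSERT INTO queue_table (queue_row_guid) VALUES (?)",
    [("a",), ("b",), ("a",), ("b",), ("c",)],
)
conn.commit()

# The subquery picks 2 rows, but matching on the duplicated GUID hits more.
by_guid = conn.execute(
    "UPDATE queue_table SET queue_status = 1 WHERE queue_row_guid IN ("
    " SELECT queue_row_guid FROM queue_table"
    " WHERE queue_status IS NULL ORDER BY id LIMIT 2)"
).rowcount
conn.rollback()  # undo, so the second variant starts from the same data

# Matching on the unique key touches exactly the rows that were picked.
by_id = conn.execute(
    "UPDATE queue_table SET queue_status = 1 WHERE id IN ("
    " SELECT id FROM queue_table"
    " WHERE queue_status IS NULL ORDER BY id LIMIT 2)"
).rowcount
conn.rollback()

print(by_guid, by_id)
```

With the sample data the GUID-based update modifies four rows although only two were picked, which is the overlap that lets two sessions collide.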
Suppose I have a table with 100 rows. I want to select the top 10 rows, but only rows that have not been processed before.
For this I have added a Flag column, which I update whenever I process rows.
The problem arises when concurrent requests come in for the top 10 rows. Both may get the same rows and try to update the same rows (which I don't want).
I can't use Begin Transaction here because it will lock the table and concurrent requests will not be handled.
Requirement: when I select the top 10 rows using the flag condition and update them, another request arriving at the same time should select a different top 10 rows, ones that are not being handled by Request 1.
Example : My table contains 100 rows.
{
Select top 10 * from table_name where flag=0
update table_name set top 10 flag = 1
}
(Will select the top 10 out of 100 rows and update them)
If another request comes in at the same time as the request above,
{
Select top 10 * from table_name where flag=0 (Should skip previous request rows)
update table_name set top 10 flag = 1
}
Need: (Will select the top 10 out of the remaining 90 rows and update them)
I need a lock on the top 10 rows of the first request, but the second request should skip the rows locked by the first one, even when both requests run their select statements simultaneously.
Please help me solve this.
You can use an OUTPUT clause to do both the selecting and the updating the flag in one statement, e.g.
UPDATE TOP (10) table
SET flag = 1
OUTPUT inserted.*
WHERE flag = 0
If I understand you correctly, you don't want to use a transaction because it will lock the table for the duration of the update.
Maybe you could split the process into one part that selects the rows and updates the flag, and a second part where you actually do your update with the selected rows.
Use a transaction only for the first part of the task. This ensures the table is locked for the minimum amount of time.
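A rough sketch of that split, using Python's sqlite3 as a stand-in for the real database. The table `work_items`, its columns and the helper `claim_batch` are made-up names, and `BEGIN IMMEDIATE` is SQLite's way of taking the write lock up front, not SQL Server syntax.

```python
import sqlite3

def claim_batch(conn, batch_size):
    """Claim up to batch_size unprocessed rows in one short transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM work_items WHERE flag = 0 ORDER BY id LIMIT ?",
            (batch_size,),
        )]
        conn.executemany(
            "UPDATE work_items SET flag = 1 WHERE id = ?",
            [(i,) for i in ids],
        )
        conn.execute("COMMIT")  # lock is released here, before the slow work
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return ids

# isolation_level=None means the module never opens transactions implicitly,
# so the explicit BEGIN/COMMIT above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE work_items ("
    " id INTEGER PRIMARY KEY,"
    " flag INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO work_items (id) VALUES (?)", [(i,) for i in range(1, 101)]
)

first = claim_batch(conn, 10)   # part 1 for request 1
second = claim_batch(conn, 10)  # part 1 for request 2
print(first, second)
# Part 2 (the slow processing of the claimed ids) runs outside any transaction.
```

Because the flag is set before the transaction commits, the second call cannot see the rows claimed by the first, so the two batches do not overlap.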
As for your non-repeatable reads:
If you really want to enforce this policy, you should delete the selected rows from the table and optionally save them to another table that keeps the read history. The most basic way to guarantee this is to update another flag (updated?) and use a trigger that fires after the update.
Transaction with ISOLATION LEVEL REPEATABLE READ
{
select top 10 rows
update select-flag
return the 10 rows
}
normal query
{
take the returned 10 rows and do something
change updated-flag
}
Trigger after update if updated-flag changed
{
copy updated to read-history-table
delete updated-rows
}
ISOLATION LEVELS on MSDN
REPEATABLE READ "Specifies that statements cannot read data that has
been modified but not yet committed by other transactions and that
no other transactions can modify data that has been read by the
current transaction until the current transaction completes."
I have an Oracle source, and I'm getting the entire table, and it is being copied to a SQL Server 2008 table that looks the same. Just for testing, I would like to only get a subset of the table.
In the old DTS packages, under Options on the data transform, I could set a first and last record number, and it would only get that many records.
If I were doing a query, I could change it to a select top 5000 or set rowcount 5000 at the top (maybe? This is an Oracle source). But I'm grabbing the entire table.
How do I limit the rowcount when selecting an Oracle table?
We can use the Row Count component in the data flow and, after the component, put the condition User::rowCount <= 500 in the precedence constraint while loading into the target. Once the count exceeds 500, the process stops inserting data into the target table.
thanks
prav
It's been a while since I've touched PL/SQL, but I would think that you could simply add a where condition of "rownum <= n", where n is the number of rows that you want for your sample. ROWNUM is a pseudo-column that is available on each Oracle table. It's a handy feature for problems like this (it's roughly equivalent to T-SQL's row_number() function, without the ability to partition and sort, I think). This would keep you from having to bring the whole table into memory:
select col1, col2
from tableA
where rownum <= 10;
For future reference (and only because I've been working with it lately), DB2's equivalent for this is the clause "fetch first n rows only" at the end of the statement:
select col1, col2
from tableA
fetch first 10 rows only;
Hope I've not been too off base.
The Row Sampling component in the data flow restricts the number of rows. Just insert it between your source and destination and set the number of rows. It is very useful for large amounts of data and when you cannot modify the query. In my example, the source executes a stored procedure.