SQL: Concurrent transactions with SELECT

I have a transaction which looks like this:
BEGIN;
SELECT last_table_row;
...some long queries with last_table_row data...
DELETE selected_last_table_row;
COMMIT;
If I run this transaction twice at the same time, the second one will use the same last_table_row item as the first transaction (tried from the psql console), which is not desired. How should I handle this type of transaction so it is safe and the transactions don't interfere?

Use SELECT FOR UPDATE.
If FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE or FOR KEY SHARE is specified, the SELECT statement locks the selected rows against concurrent updates.
Here is the documentation on The Locking Clause.
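A minimal sketch of how that applies here (assuming the table is called yourtable with a numeric id column, the names used in the next answer):
BEGIN;
-- a second concurrent transaction blocks here until the first commits,
-- instead of reading the same row
SELECT *
FROM yourtable
WHERE id = (SELECT MAX(id) FROM yourtable)
FOR UPDATE;
...some long queries with last_table_row data...
DELETE FROM yourtable WHERE id = ...; -- the id selected above
COMMIT;
Once the first transaction commits its DELETE, the blocked SELECT no longer sees that row (under the default READ COMMITTED level it returns zero rows rather than the next MAX(id)), so the same row is never processed twice.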

As an alternative to SELECT FOR UPDATE: since you'll remove that row at the end of the transaction anyway, you may as well remove it at the beginning, copying it into a temporary table:
BEGIN;
CREATE TEMP TABLE last_table_row ON COMMIT DROP AS
WITH ltr AS (
    DELETE FROM yourtable
    WHERE id = (SELECT MAX(id) FROM yourtable)
    RETURNING *
)
SELECT * FROM ltr;
...some long queries with last_table_row data...
COMMIT;

Related

Which delete statement is better for deleting millions of rows?

I have table which contains millions of rows.
I want to delete all the data which is over a week old based on the value of column last_updated.
So here are my two queries.
Approach 1:
Delete from A where to_date(last_updated,'yyyy-mm-dd') < sysdate-7;
Approach 2:
l_lastupdated varchar2(255) := to_char(sysdate-nvl(p_days,7),'YYYY-MM-DD');
insert into B(ID) select ID from A where LASTUPDATED < l_lastupdated;
delete from A where id in (select id from B);
Which one is better considering performance, safety, and locking?
Assuming the delete removes a significant fraction of the data & millions of rows, approach three: create a new table holding only the rows you want to keep, then swap it in:
create table tmp as
select * from a
where to_date(last_updated,'yyyy-mm-dd') >= sysdate-7;
drop table a;
rename tmp to a;
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2345591157689
Obviously you'll need to copy over all the indexes, grants, etc. But online redefinition can help with this https://oracle-base.com/articles/11g/online-table-redefinition-enhancements-11gr1
When you get to 12.2, there's another simpler option: a filtered move.
This is an alter table move operation, with an extra clause stating which rows you want to keep:
create table t (
c1 int
);
insert into t values ( 1 );
insert into t values ( 2 );
commit;
alter table t
move including rows where c1 > 1;
select * from t;
C1
2
While you're waiting to upgrade to 12.2+, and if you don't want to use the create-as-select method for some reason, approach 1 is superior:
Both methods delete the same rows from A* => it's the same amount of work to do the delete
Option 1 has one statement; Option 2 has two statements; 2 > 1 => option 2 is more work
*Statement-level consistency means you might get different results running the two processes. Say another session tries to update an old row that your process will remove.
With just the delete, the update is blocked until the delete finishes. At that point the row is gone, so the update does nothing.
Whereas if you do the insert first, the other session can update and commit the row in the window before the delete runs. So the update "succeeds", but the delete then removes it anyway! Which can lead to some unhappy customers...
Your stored dateformat seems suitable for proper sorting, so you could go the other way round and convert sysdate to string:
--this is false today
select * from dual where '2019-06-05' < to_char(sysdate-7, 'YYYY-MM-DD');
--this is true today
select * from dual where '2019-05-05' < to_char(sysdate-7, 'YYYY-MM-DD');
So it would be:
Delete from A where last_updated < to_char(sysdate-7, 'yyyy-mm-dd');
It has the benefit that your existing index on the column (if there is any) will be used.
It has the disadvantage of relying on the string/varchar ordering, which might change, e.g. with NLS changes (if I remember right), so in any case you should do a little testing first...
In the long term, you should of course alter the column to a proper date datatype, but I guess that doesn't help you right now ;)
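A rough sketch of that long-term fix, for illustration (the new column name last_updated_dt is made up; test before running in anger):
-- add a real DATE column and backfill it from the string column
alter table A add (last_updated_dt date);
update A set last_updated_dt = to_date(last_updated, 'yyyy-mm-dd');
-- once nothing reads the varchar column any more:
alter table A drop column last_updated;
alter table A rename column last_updated_dt to last_updated;
After that, Delete from A where last_updated < sysdate-7; works directly and can use a normal date index.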
If you are trying to delete most of the rows in the table, I would advise you go with a different approach, namely:
create <new table name> as
select *
from <old table name>
where <predicates for the data you want to keep>;
then
drop table <old table name>;
and finally you can rename the new table back to the old table.
You could always partition the new table (i.e. create the new table with a separate statement containing the partitioning clauses, and then have an insert as select into the new table from the old table).
That way, when you need to delete rows, it's a simple matter of dropping the relevant partition(s).
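For illustration, a hedged sketch of that partitioned variant (Oracle syntax; it assumes a proper DATE column last_updated_dt, a made-up name, since the varchar column from the question would need converting first):
create table a_new
partition by range (last_updated_dt)
interval (numtodsinterval(1, 'DAY'))
( partition p0 values less than (date '2019-01-01') )
as
select * from a;

-- purging a day of old data then becomes a metadata-only operation:
alter table a_new drop partition for (date '2019-05-01');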

Copy table data from one database to another database (SQL)

I have had a look at similar problems, but none of the answers helped in my case.
Just a little bit of background: I have two databases, and both have the same table with the same fields and structure. Data already exists in both tables. I want to overwrite and add to the data in db1.table from db2.table; the primary ID is causing a problem with the update.
When I use the query:
USE db1;
INSERT INTO db2.table(field_id,field1,field2)
SELECT table.field_id,table.field1,table.field2
FROM table;
It works with a blank table, because none of the primary keys exist yet. As soon as a primary key already exists, it fails.
Would it be easier for me to overwrite the primary keys, or to find the primary key and update the fields related to the field_id? I'm really not sure how to go ahead from here. The data needs to be migrated every 5 minutes, so possibly a stored procedure is required?
First you should add the new records, then update all existing records. You can create a procedure like the code below:
CREATE OR REPLACE PROCEDURE sync_data IS
BEGIN
  -- first add the records that are missing from db2.table
  INSERT INTO db2.table
  SELECT *
  FROM db1.table t
  WHERE t.field_id NOT IN (SELECT tt.field_id FROM db2.table tt);

  -- then update every record that already exists there
  FOR t IN (SELECT * FROM db1.table) LOOP
    UPDATE db2.table aa
    SET aa.field1 = t.field1,
        aa.field2 = t.field2
    WHERE aa.field_id = t.field_id;
  END LOOP;
END sync_data;
Set IsIdentity to No under Identity Specification on the table into which you want to move data, and after executing your script, set it to Yes again.
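In SQL Server, the scriptable equivalent of that designer toggle is SET IDENTITY_INSERT; a sketch, assuming the tables live in the dbo schema (the question's literal name table is bracketed since it is a reserved word):
SET IDENTITY_INSERT db2.dbo.[table] ON;

INSERT INTO db2.dbo.[table] (field_id, field1, field2)
SELECT field_id, field1, field2
FROM db1.dbo.[table];

SET IDENTITY_INSERT db2.dbo.[table] OFF;
Note that an explicit column list is required while IDENTITY_INSERT is on.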
I ended up just removing the data in the new database and sending it again.
DELETE FROM db2.table WHERE db2.table.field_id != 0;
USE db1;
INSERT INTO db2.table(field_id,field1,field2)
SELECT table.field_id,table.field1,table.field2
FROM table;
It's not very efficient, but it gets the job done. I couldn't figure out the syntax to correctly do an UPDATE or to change the IsIdentity field within MariaDB, so I'm not sure whether those would work or not.
The overhead of deleting and replacing non-trivial amounts of data for an entire table will be prohibitive. That said, I'd prefer to update in place (merge) over delete/replace.
USE db1;
INSERT INTO db2.table (field_id, field1, field2)
SELECT t.field_id, t.field1, t.field2
FROM table t
ON DUPLICATE KEY UPDATE field1 = t.field1, field2 = t.field2;
This can be used inside a procedure and called every 5 minutes (not recommended), or you could build a trigger that fires on INSERT and UPDATE to keep the tables in sync; a sketch of such a trigger follows.
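A hedged sketch of such a trigger (MariaDB syntax; the trigger name is made up, and the question's literal name table is backquoted since TABLE is a reserved word):
DELIMITER //
CREATE TRIGGER db1.sync_after_insert
AFTER INSERT ON db1.`table`
FOR EACH ROW
BEGIN
    -- mirror the new row into db2, updating it if the key already exists
    INSERT INTO db2.`table` (field_id, field1, field2)
    VALUES (NEW.field_id, NEW.field1, NEW.field2)
    ON DUPLICATE KEY UPDATE field1 = NEW.field1, field2 = NEW.field2;
END //
DELIMITER ;
A matching AFTER UPDATE trigger would handle changes to existing rows.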
INSERT INTO database1.tabledata SELECT * FROM database2.tabledata;
But you have to keep the varchar lengths larger than or equal to those in database2, and keep the same column names.

Does a U-SQL script execute in sequence?

I am supposed to do an incremental load and am using the structure below.
Do the statements execute in sequence, i.e. is TRUNCATE never executed before the first two statements, which read the data?
@newData = EXTRACT ... (FROM FILE STREAM);
@existingData = SELECT * FROM dbo.TableA; // this is an ADLA table
@allData = SELECT * FROM @newData UNION ALL SELECT * FROM @existingData;
TRUNCATE TABLE dbo.TableA;
INSERT INTO dbo.TableA SELECT * FROM @allData;
To be very clear: U-SQL scripts are not executed statement by statement. Instead, the compiler groups the DDL/DML/OUTPUT statements in order, and the query expressions are just subtrees of the inserts and outputs. But first it binds the data to their names during compilation, so your SELECT from TableA will be bound to the data (kind of like a lightweight snapshot). So even if the TRUNCATE is executed before the SELECT, you should still be able to read the data from TableA (note that permission changes may impact that).
Also, if your script fails during the execution phase, you should have an atomic execution. That means if your INSERT fails, the TRUNCATE should be undone at the end.
Having said that, why don't you INSERT incrementally and run ALTER TABLE REBUILD periodically, instead of using the above pattern that reads the full table on every insertion?
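A minimal sketch of that suggested pattern (the file path and the two columns are made up for illustration):
// incremental load: just append the new rows
@newData =
    EXTRACT id int,
            payload string
    FROM "/input/newdata.csv"
    USING Extractors.Csv();

INSERT INTO dbo.TableA
SELECT * FROM @newData;

// periodically, in a separate maintenance job, compact the table:
ALTER TABLE dbo.TableA REBUILD;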

Can an update with a nested select be considered atomic in Sybase?

I am trying something like this:
set rowcount 10 -- fetch only 10 rows
Update tableX set x = @BatchId where id in (select id from tableX where x = 0)
Basically this marks 10 records as booked by supplying a batch ID.
So my question is: if this proc is executed in parallel, can I guarantee that the update with nested select will be atomic, and that no two invocations will select the same set of records from tableX for booking?
Thanks
To guarantee that no such overlaps occur, you should:
(i) put BEGIN TRANSACTION - COMMIT around the statement
(ii) put the HOLDLOCK keyword directly behind 'tableX' (or run the whole statement at isolation level 3).
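Put together, a minimal sketch of the adjusted batch (Sybase ASE syntax, names from the question):
begin transaction
    set rowcount 10
    update tableX
    set x = @BatchId
    where id in (select id from tableX holdlock where x = 0)
    set rowcount 0
commit transaction
The HOLDLOCK keeps the locks from the nested select until commit, so a parallel invocation cannot read the same unbooked rows and book them a second time.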

MS SQL locking for update

I am implementing a competition where there might be a lot of simultaneous entries. I am collecting some user data, which I am putting in one table called entries. I have another table of pre-generated unique discount codes in a table called discountCodes. I am then assigning one to each entry. I thought I would do this by putting an entry id in the discountCodes table.
As there may be a lot of concurrent users, I think I should select the first unassigned row and then assign the entry id to that row. I need to make sure that, between picking an unassigned row and adding the entry id, another thread doesn't find the same row.
What is the best way of ensuring that the row doesn't get assigned twice?
Something can be done like this:
The following example sets the TRANSACTION ISOLATION LEVEL for the session. For each Transact-SQL statement that follows, SQL Server holds all of the shared locks until the end of the transaction. Source:MSDN
USE databaseName;
GO
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
GO
BEGIN TRANSACTION;
GO
SELECT *
FROM Table1;
GO
SELECT *
FROM Table2;
GO
COMMIT TRANSACTION;
GO
Read more SET TRANSACTION ISOLATION LEVEL
I would recommend building a bridge table, with an EntryId and a DiscountCodeId, instead of having the EntryId in the DiscountCodes table. Place a unique constraint on both of those fields.
This way your entry point will encounter a constraint violation when it tries to enter a duplicate.
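For illustration, a sketch of that bridge table (the table and constraint names are made up):
CREATE TABLE EntryDiscountCodes
(
    EntryId        INT NOT NULL,
    DiscountCodeId INT NOT NULL,
    CONSTRAINT UQ_EntryDiscountCodes_EntryId        UNIQUE (EntryId),
    CONSTRAINT UQ_EntryDiscountCodes_DiscountCodeId UNIQUE (DiscountCodeId)
);
With both columns unique, a thread that tries to claim an already-claimed code (or hand a second code to the same entry) hits a constraint violation instead of silently double-assigning.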
WITH e AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM entries ei
WHERE NOT EXISTS
(
SELECT NULL
FROM discountCodes dci
WHERE dci.entryId = ei.id
)
),
dc AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM discountCodes
WHERE entryId IS NULL
)
UPDATE dc
SET dc.entryId = e.id
FROM e
JOIN dc
ON dc.rn = e.rn
I would just put an IDENTITY field on each table and let the corresponding entry match the corresponding discountCode - i.e. if you have a thousand discountCodes up front, your identity column in the discountCodes table will range from 1 to 1000. That will match your first 1 to 1000 entries. If you get more than 1000 entries, just add one discountCode per additional entry.
That way SQL Server handles all the problematic "get the next number in the sequence" logic for you.
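A sketch of that pairing (assuming both identity columns are named id and the code column is called code, which are made-up names):
SELECT e.id AS entryId, dc.code
FROM entries e
JOIN discountCodes dc ON dc.id = e.id;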
You can try sp_getapplock to synchronize the write operation; just make sure every caller locks against the same resource name, like:
DECLARE #state Int
BEGIN TRAN
-- here we're using 1 sec's as time out, but you should determine what the min value is for your instance
EXEC #state = sp_getapplock 'SyncIt', 'Exclusive', 'Transaction', 1000
-- do insert/update/etc...
-- if you like you can be a little verbose and explicit, otherwise line below shouldn't be needed
EXEC sp_releaseapplock 'SyncIt', 'Transaction'
COMMIT TRAN