oracle select & update or select & insert in multithreaded env - sql

I have a problem and I have the impression that the solution is simple, but I cannot figure it out.
I have a multi-threaded environment and a PL/SQL stored procedure.
Inside this procedure I have something like this:
select count(*) into mycount from toto;
if mycount > 0 then update ...;
else insert ...;
The problem is that I have many threads calling this procedure.
Is there a simple way to have only one thread at a time executing the piece of code above?
I know that I can use SELECT FOR UPDATE, but since I could have an UPDATE or an INSERT I guess this does not work for me.
Thanks a lot.

Have a separate table MODIFY_CHECKER with just one column FLAG. Use this table and column as a means to allow only one thread to update/insert on your actual table (TOTO, I suppose).
You could add something like the following to your existing PL/SQL procedure:
DECLARE
  l_busy NUMBER;
BEGIN
  SELECT COUNT(1) INTO l_busy FROM modify_checker WHERE flag = 1;
  IF l_busy > 0 THEN
    -- Another thread is already working, so just raise an exception
    RAISE <<exception>>;
  ELSE
    -- No other thread is working on this, so go ahead
    UPDATE modify_checker SET flag = 1;
    COMMIT;
    <<actual code to update or insert actual table>>
    UPDATE modify_checker SET flag = 0;
    COMMIT;
  END IF;
END;

The simplest solution for what you want would be to introduce an intermediary queue of some sort at the application level - the various threads would send a request to the queue and a queue processor would read requests off of the queue and make the necessary call to the DB.
This would, in my opinion, be the easiest solution in that you don't have to change too much to get it working. The only potential problem is if a response is necessary: this is still an option, but the app code gets a bit more complicated by having to deal with asynchronous responses.

You can put a lock as soon as you enter the procedure and then release it when you exit. This way only one session can execute this code at a time.
ALTER TABLE tl_test ENABLE TABLE LOCK;
ALTER TABLE tl_test DISABLE TABLE LOCK;
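For reference, a minimal sketch of taking an explicit lock around the critical section instead, using LOCK TABLE (this assumes serializing on the whole TOTO table is acceptable; the lock is released automatically at COMMIT or ROLLBACK):
-- inside the procedure, before the count/update/insert logic:
LOCK TABLE toto IN EXCLUSIVE MODE;   -- blocks until no other session holds the lock
-- ... existing SELECT COUNT / UPDATE / INSERT logic ...
COMMIT;                              -- releases the table lock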


Getting newest not handled, updated rows using txid

I have a table in my PostgreSQL database (actually it's multiple tables, but for the sake of simplicity let's assume it's just one) and multiple clients that periodically need to query the table to find changed items. These are updated or inserted items (deleted items are handled by first marking them for deletion and then actually deleting them after a grace period).
Now the obvious solution would be to just keep a "modified" timestamp column for each row, remember it for each client, and then simply fetch the changed ones:
SELECT * FROM the_table WHERE modified > saved_modified_timestamp;
The modified column would then be kept up to date using triggers like:
CREATE FUNCTION update_timestamp()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
NEW.modified = NOW();
RETURN NEW;
END;
$$;
CREATE TRIGGER update_timestamp_update
BEFORE UPDATE ON the_table
FOR EACH ROW EXECUTE PROCEDURE update_timestamp();
CREATE TRIGGER update_timestamp_insert
BEFORE INSERT ON the_table
FOR EACH ROW EXECUTE PROCEDURE update_timestamp();
The obvious problem here is that NOW() is the time the transaction started. So it might happen that a transaction is not yet committed while fetching the updated rows, and when it is committed, the timestamp is lower than the saved_modified_timestamp, so the update is never registered.
I think I found a solution that would work and I wanted to see if you can find any flaws with this approach.
The basic idea is to use xmin (or rather txid_current()) instead of the timestamp and then when fetching the changes, wrap them in an explicit transaction with REPEATABLE READ and read txid_snapshot() (or rather the three values it contains txid_snapshot_xmin(), txid_snapshot_xmax(), txid_snapshot_xip()) from the transaction.
If I read the Postgres documentation correctly, then all changes made by transactions that are < txid_snapshot_xmax() and not in txid_snapshot_xip() should be returned in that fetch transaction. This information should then be all that is required to get all the updated rows when fetching again. The select would then look like this, with xmin_version replacing the modified column:
SELECT * FROM the_table WHERE
xmin_version >= last_fetch_txid_snapshot_xmax OR xmin_version IN last_fetch_txid_snapshot_xip;
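For concreteness, a sketch of what that fetch transaction could look like (the :last_fetch_* placeholders are client-side bind variables holding the snapshot values remembered from the previous fetch, with the xip list bound as a bigint array; xmin_version is the column maintained by the triggers below):
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- record the snapshot boundaries to use as the starting point of the next fetch
SELECT txid_snapshot_xmin(txid_current_snapshot()),
       txid_snapshot_xmax(txid_current_snapshot());
SELECT txid_snapshot_xip(txid_current_snapshot());  -- txids still in progress at snapshot time, possibly none
-- fetch everything changed since the previous snapshot
SELECT * FROM the_table
WHERE xmin_version >= :last_fetch_txid_snapshot_xmax
   OR xmin_version = ANY (:last_fetch_txid_snapshot_xip);
COMMIT;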
The triggers would then be simply like this:
CREATE FUNCTION update_xmin_version()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
NEW.xmin_version = txid_current();
RETURN NEW;
END;
$$;
CREATE TRIGGER update_timestamp_update
BEFORE UPDATE ON the_table
FOR EACH ROW EXECUTE PROCEDURE update_xmin_version();
CREATE TRIGGER update_timestamp_update_insert
BEFORE INSERT ON the_table
FOR EACH ROW EXECUTE PROCEDURE update_xmin_version();
Would this work? Or am I missing something?
Thank you for the clarification about the 64-bit return from txid_current() and how the epoch rolls over. I am sorry I confused that epoch counter with the time epoch.
I cannot see any flaw in your approach but would verify through experimentation that having multiple client sessions concurrently in repeatable read transactions taking the txid_snapshot_xip() snapshot does not cause any problems.
I would not use this method in practice, because I assume the client code will need to handle duplicate reads of the same change (insert/update/delete) anyway, as well as periodic reconciliation between the database contents and the client's working set to handle drift due to communication failures or client crashes. Once that code is written, then using now() in the client tracking table, clock_timestamp() in the triggers, and a grace-interval overlap when the client pulls changesets would work for the use cases I have encountered.
If requirements called for stronger real-time integrity than that, then I would recommend a distributed commit strategy.
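A minimal sketch of that timestamp-plus-grace-interval alternative (the trigger function name and the 5-minute grace value are illustrative; the grace interval must cover the longest transaction you expect, and the client must de-duplicate rows it has already seen):
CREATE FUNCTION stamp_modified()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
  NEW.modified = clock_timestamp();  -- statement time, not transaction start time
  RETURN NEW;
END;
$$;
-- the client records now() at each pull and next time fetches with an overlap:
SELECT * FROM the_table
WHERE modified > :last_pull_time - interval '5 minutes';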
Ok, so I've tested it in depth now and haven't found any flaws so far. I had around 30 clients writing and reading to the database at the same time and all of them got consistent updates. So I guess this approach works.

How can I update multiple items from a select query in Postgres?

I'm using node.js, node-postgres and Postgres to put together a script to process quite a lot of data from a table. I'm using the cluster module as well, so I'm not stuck with a single thread.
I don't want one of the child processes in the cluster duplicating the processing of another. How can I update the rows I just received from a select query without the possibility of another process or query having also selected the same rows?
I'm assuming my SQL query will look something like:
BEGIN;
SELECT * FROM mytable WHERE ... LIMIT 100;
UPDATE mytable SET status = 'processing' WHERE ...;
COMMIT;
Apologies for my poor knowledge of Postgres and SQL, I've used it once before in a simple PHP web app and never before with node.js.
If you're using a multithreaded application you cannot and should not be using FOR UPDATE (in the main thread anyway); what you need to be using is an advisory lock. Each thread can query a row or many rows, verifying that they're not locked, and then lock them so no other session uses them. It's as simple as this within each thread:
SELECT * FROM mytable
WHERE pg_try_advisory_lock(mytable.id)
LIMIT 100;
At the end, be sure to release the locks using pg_advisory_unlock.
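A small sketch of the release step (the keys must be the same ids that were passed to pg_try_advisory_lock; the literal ids here are just examples):
-- release each processed row's lock individually...
SELECT pg_advisory_unlock(id) FROM mytable WHERE id IN (101, 102, 103);
-- ...or release every advisory lock held by this session at once
SELECT pg_advisory_unlock_all();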
BEGIN;
UPDATE mytable SET status = 'processing'
WHERE status <> 'processing' AND id IN
  (SELECT id FROM mytable WHERE status <> 'processing' LIMIT 100)
RETURNING *;
COMMIT;
There's a chance that's going to fail if some other query was working on the same rows, so if you get an error, retry it until you get some data or no rows returned.
If you get zero rows, either you're finished or there are too many other simultaneous processes like yours.

Identify the action that is deleting all rows in a table

There is SQL Server 2012 database that is used by three different applications. In that database there is a table that contains ~500k rows and for some mysterious reason this table gets emptied every now and then. I think this is possibly caused by:
A delete query without a where clause
A delete query in a loop gone wild
I am trying to locate the cause of this issue by reviewing code but no joy. I need an alternate strategy. I think I can use triggers to detect what/why all rows get deleted but I am not sure how to go about this. So:
Can I use triggers to check if a query is attempting to delete all rows?
Can I use triggers to log the problematic query and the application that issues that query?
Can I use triggers to log such actions into a text file/database table/email?
Is there a better way?
You can use Extended Events to monitor your system.
A simple event session can monitor for DELETE and TRUNCATE statements.
When these events are raised, the details are written to a file; for a DELETE statement the collected details include the statement text, the client host name and the connection id (you can configure the script to collect more data).
Here is the script used; modify the output file path:
CREATE EVENT SESSION [CheckDelete] ON SERVER
ADD EVENT sqlserver.sql_statement_completed(SET collect_statement=(1)
ACTION(sqlserver.client_connection_id,sqlserver.client_hostname)
WHERE ([sqlserver].[like_i_sql_unicode_string]([statement],N'%delete%') OR [sqlserver].[like_i_sql_unicode_string]([statement],N'%truncate%')))
ADD TARGET package0.event_file(SET filename=N'C:\temp\CheckDelete.xel',max_file_size=(50))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO
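Once events have been collected, a sketch of reading them back from the file target (the path must match the event_file target above, and the XQuery paths follow the fields and actions configured in the session):
SELECT
    x.event_data.value('(event/@timestamp)[1]', 'datetime2')                                  AS event_time,
    x.event_data.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)')           AS statement,
    x.event_data.value('(event/action[@name="client_hostname"]/value)[1]', 'nvarchar(128)')   AS client_hostname
FROM (
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file(N'C:\temp\CheckDelete*.xel', NULL, NULL, NULL)
) AS x;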
This is a possibility that may help you. It creates a trigger on the table that sends an email when a process DELETEs more than 100 records. I'd modify the message to include some useful data like:
Process ID (@@SPID)
Host (HOST_NAME())
Name of app (APP_NAME())
And possibly the entire query
CREATE TRIGGER Table1MassDeleteTrigger
ON dbo.Activities
FOR DELETE
AS
BEGIN
    DECLARE @DeleteCount INT = (SELECT COUNT(*) FROM deleted);
    IF (@DeleteCount > 100)
        EXEC msdb.dbo.sp_send_dbmail
            @profile_name = 'MailProfileName',
            @recipients = 'admin@yourcompany.com',
            @body = 'Something is deleting all your data!',
            @subject = 'Oops!';
END;
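If logging to a database table (rather than, or alongside, email) fits better, a sketch along the same lines; the audit table name and its columns here are hypothetical:
CREATE TABLE dbo.MassDeleteAudit (
    EventTime   DATETIME2     NOT NULL DEFAULT SYSDATETIME(),
    SessionId   INT           NOT NULL,
    HostName    NVARCHAR(128) NULL,
    AppName     NVARCHAR(128) NULL,
    DeletedRows INT           NOT NULL
);
GO
CREATE TRIGGER Table1MassDeleteAudit
ON dbo.Activities
FOR DELETE
AS
BEGIN
    DECLARE @DeleteCount INT = (SELECT COUNT(*) FROM deleted);
    IF (@DeleteCount > 100)
        INSERT INTO dbo.MassDeleteAudit (SessionId, HostName, AppName, DeletedRows)
        VALUES (@@SPID, HOST_NAME(), APP_NAME(), @DeleteCount);
END;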

Debug Insert and temporal tables in SQL 2012

I'm using SQL Server 2012, and I'm debugging a stored procedure that does some INSERT INTO #temporal table SELECT statements.
Is there any way to view the data selected in the command (the subquery of the INSERT INTO)?
Is there any way to view the data inserted and/or the temporal table where the insert made the changes?
It doesn't matter if it is the total rows, not one by one.
UPDATE:
Requirements from AT Compliance and Company Policy require that any modification go through the test process, and it's probable this will be managed by another team. Is there any way to avoid any change to the script?
The main idea is that the AT user checks the outputs on their work desktop and copies and pastes them, without making any change to the environment or product.
Thanks and kind regards.
If I understand your question correctly, then take a look at the OUTPUT clause:
Returns information from, or expressions based on, each row affected
by an INSERT, UPDATE, DELETE, or MERGE statement. These results can be
returned to the processing application for use in such things as
confirmation messages, archiving, and other such application
requirements.
For instance:
INSERT INTO #temporaltable
OUTPUT inserted.*
SELECT *
FROM ...
This will give you all the rows from the INSERT statement that were inserted into the temporal table, which were selected from the other table.
Is there any reason you can't just do this: SELECT * FROM #temporal? (And debug it in SQL Server Management Studio, passing in the same parameters your application is passing in).
It's a quick and dirty way of doing it, but one reason you might want to do it this way over the other (cleaner/better) answer, is that you get a bit more control here. And, if you're in a situation where you have multiple inserts to your temp table (hopefully you aren't), you can just do a single select to see all of the inserted rows at once.
I would still probably do it the other way though (now I know about it).
I know of no way to do this without changing the script. However, for the future, you should never write a complex stored proc or script without a debug parameter that allows you to put in the data tests you will want. Make it the last parameter with a default value of 0 and you won't even have to change your current code that calls the proc.
Then you can add statements like the below everywhere you will want to check intermediate results. Further, in debug mode you might always roll back any transactions so that a bug will not affect the data.
IF @debug = 1
BEGIN
    SELECT * FROM #temp;
END
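A minimal sketch of that pattern as a whole (the procedure, parameter, and source table names here are hypothetical):
CREATE PROCEDURE dbo.LoadStaging
    @SourceId INT,
    @debug BIT = 0      -- last parameter, default 0, so existing callers need no change
AS
BEGIN
    CREATE TABLE #temp (id INT, val NVARCHAR(50));

    INSERT INTO #temp (id, val)
    SELECT id, val
    FROM dbo.SourceTable        -- hypothetical source
    WHERE id = @SourceId;

    IF @debug = 1
    BEGIN
        -- intermediate results are only returned when called with @debug = 1
        SELECT * FROM #temp;
    END
END;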

PL/SQL Dynamic Table Names

I'm pretty new to Oracle so not entirely sure this is possible, or if perhaps I'm going about it the wrong way but here goes ...
Part of an old feeder script I'm fixing is looping through ~ 20 tables (could change anytime) to populate relevant staging tables. This part is currently very basic:
...
INSERT INTO staging_tbl_1(
SELECT *
FROM source_tbl_1
);
INSERT INTO staging_tbl_2(
SELECT *
FROM source_tbl_2
);
...
Some of the fields in the source database have different constraints etc., which means that every now and then an insert will throw an exception and the feeder will stop. What I'm hoping to do is create a procedure within the existing feeder package that loops through each row of each source table before it is inserted and simply wraps the insert in an exception block. This way failures can be logged without causing the feeder to stop.
Essentially I'm chasing something like this:
BEGIN procedure_x(source_record, staging_record)
-- Perform validation to ensure records exist
-- Loop through all record rows
FOR row IN (SELECT * FROM source_record) LOOP
-- Wrap in exception block
-- Insert into staging record
-- Log exception if it occurs
END LOOP;
END
I've attempted ref cursors; however, in order to get them to work I would also need to know the rowtype in advance (from my limited understanding). I've also tried EXECUTE IMMEDIATE, but I cannot find a way to loop over it in an appropriate way. Are there any other ways to tackle this?
Additional:
I realise that we really should be fixing the source of the problem rather than going about it like this, unfortunately it is far outside my area of influence.
It is possible to do this without making a separate procedure and just wrap all of the table references in a loop, however I'd like to leave this as a last resort.
Oracle has functionality for logging of DML errors. Use it with single SQL statements. Don't go row-by-row and make your processes crawl.
http://docs.oracle.com/cd/B19306_01/server.102/b14231/tables.htm#ADMIN10261
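A minimal sketch of DML error logging applied to the feeder above (the error log table name is the DBMS_ERRLOG default, ERR$_ plus the table name; the tag string is illustrative):
-- one-time setup per staging table
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG('STAGING_TBL_1');  -- creates ERR$_STAGING_TBL_1
END;
/

-- the existing single-statement insert, with failing rows diverted instead of aborting the load
INSERT INTO staging_tbl_1
SELECT *
FROM source_tbl_1
LOG ERRORS INTO err$_staging_tbl_1 ('feeder run') REJECT LIMIT UNLIMITED;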