PostgreSQL Concurrency - SQL

In a project I'm working on, there's a table with an "on update" trigger that monitors whether a boolean column has changed (e.g., false -> true = do some action). But this action can only be performed once per row.
There will be multiple clients accessing the database, so I can assume that, eventually, multiple clients will try to update the same row's column in parallel.
Does the "update" trigger handle the concurrency itself, or do I need to do it in a transaction and manually lock the table?

Triggers don't handle concurrency for you, but PostgreSQL should do the right thing whether or not you use explicit transactions.
PostgreSQL uses row-level locking for writes: the first transaction to actually update the row takes a lock on that row. If a second transaction tries to update the same row, its UPDATE statement blocks until the first either commits or rolls back.
If the first transaction commits, what happens next depends on the isolation level. Under REPEATABLE READ or SERIALIZABLE, the second transaction gets a serialization error rather than silently obliterating a change it never saw. Under the default READ COMMITTED, the second UPDATE re-evaluates its WHERE clause against the newly committed row version and proceeds if it still matches, which is why the once-only check belongs in the WHERE clause itself.
If the first transaction rolls back, the second transaction's UPDATE unblocks and goes through normally, because now it isn't overwriting anything.
The second transaction can also avoid blocking by taking the lock up front with SELECT ... FOR UPDATE NOWAIT, which raises an error immediately instead of waiting when the row is already locked by an unresolved change.
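A minimal sketch of both approaches, assuming a hypothetical table tasks with a boolean column done (none of these names come from the question):

-- Make the once-only transition atomic: with the state check in the
-- WHERE clause, only one concurrent UPDATE can match the row, so the
-- trigger's action fires at most once.
UPDATE tasks
SET done = true
WHERE id = 42
  AND done = false;
-- Check the affected-row count to learn whether this client "won".

-- Or take the row lock explicitly and fail fast instead of blocking:
BEGIN;
SELECT * FROM tasks WHERE id = 42 FOR UPDATE NOWAIT;  -- errors immediately if the row is locked
UPDATE tasks SET done = true WHERE id = 42 AND done = false;
COMMIT;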

Related

Lock the newly inserted row [PostgreSQL]

Is there a way to lock a recently inserted row so that other transactions will not see it while my current transaction is still going on?
You don't have to. In fact, you can't have the opposite behaviour. There's no way to make your newly inserted row visible until your transaction commits.
Even though it's not visible to a concurrent SELECT, it can still affect a concurrent INSERT (or UPDATE). Specifically, if you try to insert the same value into a unique index in two different transactions, one will block until the other commits or rolls back. Then it'll decide whether it needs to raise a unique violation error, or whether it can continue. So while you cannot directly see uncommitted data, sometimes you can see its side effects.
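A sketch of that side effect, assuming a hypothetical table users with a unique email column:

-- Session 1:
BEGIN;
INSERT INTO users (email) VALUES ('a@example.com');
-- The new row is invisible to other sessions until COMMIT.

-- Session 2, meanwhile:
INSERT INTO users (email) VALUES ('a@example.com');
-- Blocks here. If session 1 commits, this raises a unique violation;
-- if session 1 rolls back, this insert succeeds.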

How can I lock a row, and use the lock across multiple transactions?

I have a situation where I need to:
Read the value of a row.
If the value of some column is 'X', perform action A. Otherwise, perform action B.
If we performed action A, update the column with the result of the action.
Action A is not a database operation, and may take a while to run, and it is not reversible. Action B is not a database operation, but is very fast to run. The sequence is performed on multiple threads, across multiple servers.
Currently we have no locking, and so occasionally we see action A being executed multiple times, when it should only happen once. I think my only solution here is to somehow wrap the sequence above with an acquire lock step and a release lock step, and I'm not sure how to do that.
I have seen a similar question, where the answer was to add 'locked' and 'acquiry time' columns to the row. However, in that situation, the OP wasn't worried about frequently re-acquiring the lock. If I had to spin-wait for the previous lock to expire every time I wanted to execute the sequence, my server's performance would probably go out the window.
Is there something built in to SQL that I can use here?
Update the "X" value to "pending".
On completion of action A, update "pending" to whatever.
No locking required.
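A minimal sketch of that claim step (the table and column names here are hypothetical):

-- Atomically claim the work: only one thread's UPDATE finds the row
-- still in state 'X', so action A runs at most once.
UPDATE jobs
SET value = 'pending'
WHERE id = 42
  AND value = 'X';

-- If the affected-row count is 1, this thread performs action A and
-- then records its result; otherwise it performs action B instead.
UPDATE jobs
SET value = 'result-of-A'
WHERE id = 42
  AND value = 'pending';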

Make a delete statement delete rows as soon as it finds them, rather than keeping the table static until the delete is finished

I'm wondering if there is a way to get a DELETE statement to remove rows as it traverses a table. Right now, a DELETE statement finds all the rows that match the criteria and then deletes them all once it has found them; I want it to delete a matching row immediately and then continue, comparing the next rows against the table as it stands with those entries removed.
I think this could be accomplished in a loop... maybe? But I feel like it would be horribly inefficient. Possibly something like: look for a row to delete, delete it, stop, and then start the search again on the new table.
Any ideas?
A set-oriented environment like SQL usually requires this kind of thing to happen "all at once".
You might be able to use a SQL DELETE statement within a transaction to delete a single row, with that transaction wrapped in a stored procedure to handle the logic, but that would be kind of like kicking dead whales down the beach.
You need the transaction (a committed transaction, maybe a serializable transaction) to reliably "free up" values, and to reliably handle concurrency and race conditions.
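A hedged sketch of the one-row-at-a-time loop, assuming PostgreSQL and a hypothetical table queue, driven from application code that repeats the statement until no row is affected:

-- Delete exactly one matching row per statement; committing between
-- iterations means each pass sees the table with the earlier
-- deletions already applied.
DELETE FROM queue
WHERE id = (
    SELECT id
    FROM queue
    WHERE expired = true
    ORDER BY id
    LIMIT 1
);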

SQL unique field: concurrency bugs? [duplicate]

I have a DB table with a field that must be unique. Let's say the table is called "Table1" and the unique field is called "Field1".
I plan on implementing this by performing a SELECT to see if any Table1 records exist where Field1 = @valueForField1, and only updating or inserting if no such records exist.
The problem is, how do I know there isn't a race condition here? If two users both click Save on the form that writes to Table1 (at almost the exact same time), and they have identical values for Field1, isn't it possible that the following would happen?
User1 makes a SQL call, which performs the SELECT and determines there are no existing records where Field1 = @valueForField1.
User1's process is preempted by User2's process, which also finds no records where Field1 = @valueForField1, and performs an insert.
User1's process is allowed to run again, and inserts a second record where Field1 = @valueForField1, violating the requirement that Field1 be unique.
How can I prevent this? I'm told that transactions are atomic, but then why do we need table locks too? I've never used a lock before and I don't know whether or not I need one in this case. What happens if a process tries to write to a locked table? Will it block and try again?
I'm using MS SQL 2008R2.
Add a unique constraint on the field. That way you won't have to SELECT; you will only have to INSERT. The first user will succeed, and the second will fail.
On top of that, you may make the field auto-incremented so you won't have to worry about filling it, or you may give it a default value, again so you don't have to fill it yourself.
Some options would be an auto-incremented INT field, or a unique identifier.
You can add a unique constraint. Example from http://www.w3schools.com/sql/sql_unique.asp:
CREATE TABLE Persons
(
P_Id int NOT NULL UNIQUE
)
EDIT: Please also read Martin Smith's comment below.
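Applied to the existing table from the question, the same idea looks like this (the constraint name here is made up):

ALTER TABLE Table1
ADD CONSTRAINT UQ_Table1_Field1 UNIQUE (Field1)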
jyparask has a good answer on how you can tackle this specific problem. However, I would like to elaborate on your confusion over locks, transactions, blocking, and retries. For the sake of simplicity, I'm going to assume transaction isolation level serializable.
Transactions are atomic. Under serializable isolation, the database also guarantees that the result is as if all operations in one transaction occurred completely before the next one started, no matter what kind of race conditions there are. Even if two users access the same row at the same time (on multiple cores), there is no chance of a race condition, because the database will ensure that one of them blocks or fails.
How does the database do this? With locks. When you select a row under this isolation level, SQL Server will lock it, so that other clients trying to modify that row will block. Blocking means their query is paused until the row is unlocked.
The database actually has a couple of things it can lock. It can lock the row, or the table, or somewhere in between. The database decides what it thinks is best, and it's usually pretty good at it.
There is never any automatic retrying. The database will never retry a query for you; you need to explicitly retry it yourself. The reason is that the correct behavior is hard to define. Should a query retry with the exact same parameters? Or should something be modified? Is it still safe to retry the query? It's much safer for the database to simply throw an exception and let you handle it.
Let's address your example. Assuming you use transactions correctly and do the right query (Martin Smith linked to a few good solutions), then the database will create the right locks so that the race condition disappears. One user will succeed, and the other will fail. In this case, there is no blocking, and no retrying.
In the general case with transactions, however, there will be blocking, and you get to implement the retrying.
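As a sketch of what handling the failure can look like in T-SQL (the table and variable names follow the question; the error-handling policy is just an illustration):

BEGIN TRY
    INSERT INTO Table1 (Field1) VALUES (@valueForField1)
END TRY
BEGIN CATCH
    -- 2627 = unique constraint violation, 2601 = unique index violation
    IF ERROR_NUMBER() IN (2627, 2601)
        PRINT 'Duplicate Field1: another writer got there first.'
    ELSE
        RAISERROR('Unexpected error during insert', 16, 1)
END CATCH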

Difference between "before" and "after" triggers in Oracle

Can somebody explain the difference between a "before" and an "after" trigger in Oracle 10g, with an example?
First, I'll start my answer by defining "trigger": a trigger is a stored procedure that is run when a row is added, modified, or deleted.
Triggers can run BEFORE the action is taken or AFTER the action is taken.
BEFORE triggers are usually used when validation needs to take place before accepting the change. They run before any change is made to the database. Let's say you run a database for a bank. You have a table accounts and a table transactions. If a user makes a withdrawal from his account, you would want to make sure that the user has enough credits in his account for his withdrawal. The BEFORE trigger will allow you to do that and prevent the row from being inserted in transactions if the balance in accounts is not enough.
AFTER triggers are usually used when information needs to be updated in a separate table due to a change. They run after changes have been made to the database (not necessarily committed). Let's go back to our bank example. After a successful transaction, you would want the balance to be updated in the accounts table. An AFTER trigger will allow you to do exactly that.
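A minimal PL/SQL sketch of that bank example (all table and column names here are made up for illustration):

CREATE OR REPLACE TRIGGER trg_check_balance
BEFORE INSERT ON transactions
FOR EACH ROW
DECLARE
    v_balance accounts.balance%TYPE;
BEGIN
    SELECT balance INTO v_balance
    FROM accounts
    WHERE account_id = :NEW.account_id;

    -- Reject the withdrawal before the row is ever written.
    IF :NEW.amount < 0 AND v_balance + :NEW.amount < 0 THEN
        RAISE_APPLICATION_ERROR(-20001, 'Insufficient funds');
    END IF;
END;
/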
I'm not completely sure what you're interested in knowing, so I'll keep this fundamental.
Before Triggers
As per the name, these triggers fire before the row is created in the table. Consequently, since the row has not yet been created, you have full access to the :new.table_element fields. This allows for data cleansing and uniformity when unwanted or malformed data is about to be inserted or updated. This is just a basic example, but you need to use a before trigger any time you require access to the ":new" data.
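For instance, a sketch of that cleansing idea (trigger, table, and column names are hypothetical):

CREATE OR REPLACE TRIGGER trg_clean_email
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW
BEGIN
    -- Normalize the incoming value before it is stored.
    :NEW.email := LOWER(TRIM(:NEW.email));
END;
/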
After Triggers
Since the after trigger fires once the row has already been created, these triggers are typically used when you want some logic to occur because of the new row. For example, if you have an address table and a user updates his/her address, you may want to update the address reference IDs in an xref table upon creation (if you happen to retain all old addresses as well). Also, unlike the before trigger, you cannot modify any of the column values, since the row already exists in the table.
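A sketch of that pattern, assuming a hypothetical address history table:

CREATE OR REPLACE TRIGGER trg_address_history
AFTER UPDATE ON addresses
FOR EACH ROW
BEGIN
    -- The addresses row already exists; record the old version elsewhere.
    INSERT INTO address_history (address_id, old_street, changed_at)
    VALUES (:OLD.address_id, :OLD.street, SYSDATE);
END;
/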
BEFORE triggers are used when the trigger action should determine whether or not the triggering statement should be allowed to complete; by using BEFORE triggers, you can eliminate unnecessary processing of the triggering statement.
AFTER triggers, by contrast, are used when the triggering statement should complete before the trigger action executes.