Locking a specific row in postgres - sql

I'm fairly new to Postgres, and I'm trying to figure out how to lock a specific row of a table.
As an example, I have a table of users:
Name: John, Money: 1
Name: Jack, Money: 2
In my backend, I want to select John and make sure that no other calls can update (or even select possibly) John's row until my transaction is complete.
From what I've read online, I think I need an exclusive lock, but I can't seem to find a good example of locking just one row of a table. Any ideas?
Edit - Should I be doing this at the method level, e.g. with @SqlUpdate (or some form of that - using org.skife.jdbi.v2), or in the query itself?

If you want to lock a specific selected row of a table, you need to LOCK first, then use the FOR UPDATE / FOR SHARE clause.
For example, in your case, if you need to lock the first row, you do this:
BEGIN;
LOCK TABLE person IN ROW EXCLUSIVE MODE;
-- BLOCK 1
SELECT * FROM person WHERE name = 'John' and money = 1 FOR UPDATE;
-- BLOCK 2
UPDATE person set name = 'John 2' WHERE name = 'John' and money = 1;
END;
In BLOCK 1, the statement before the SELECT is not locking anything yet; it is only telling the database "Hey, I will do something in this table, so when I do, lock it in this mode". You can still select / update / delete any row.
But in BLOCK 2, when you use FOR UPDATE, you lock that row against other transactions in specific modes (read the docs for more details). It stays locked until the transaction ends.
If you want an example, run a test: try another SELECT ... FOR UPDATE on the same row in a second session before ending the first transaction. It will wait for the first transaction to end and select right after it.
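For instance, a minimal two-session sketch using the same person table as above:

-- Session 1
BEGIN;
SELECT * FROM person WHERE name = 'John' AND money = 1 FOR UPDATE;
-- transaction left open ...

-- Session 2: blocks here until session 1 commits or rolls back
SELECT * FROM person WHERE name = 'John' AND money = 1 FOR UPDATE;

-- Session 2: a plain SELECT of the same row is NOT blocked
SELECT * FROM person WHERE name = 'John' AND money = 1;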
Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR UPDATE/SHARE) statement.
I am using it in a function to control subsequences and it is great. Hope you enjoy.

As soon as you update the row (without committing), no other transaction will be able to update that row.
If you want to lock the row before doing the update (which seems useless), you can do so using select ... for update.
You cannot prevent other sessions from reading that row, and frankly that doesn't make sense either.
Even if your transaction hasn't finished (=committed) other sessions will not see any intermediate (inconsistent) values - they will see the state of the database as it was before your transaction started. That's the whole point of having a relational database that supports transactions.
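To see this in action, here is a minimal two-session sketch (assuming the users table from the question, under the default READ COMMITTED isolation level):

-- Session 1
BEGIN;
UPDATE users SET money = 100 WHERE name = 'John';
-- not committed yet

-- Session 2: reads are never blocked; this still sees money = 1
SELECT money FROM users WHERE name = 'John';

-- Session 2: a concurrent UPDATE of the same row, however, waits for session 1
UPDATE users SET money = 50 WHERE name = 'John';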

You can use
LOCK TABLE table IN ACCESS EXCLUSIVE MODE;
when you are ready to read from your table. "SELECT" and all other operations will be queued until the end of the transaction (commit changes or rollback).
Note that this will lock the entire table; per the PostgreSQL documentation, there is no table-level lock mode that exclusively locks a specific row.
So you can use the
FOR UPDATE
row-level lock in every SELECT that is going to update your row, and this will prevent all other SELECTs that intend to update the row from reading it!
PostgreSQL documentation:
FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being locked, modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends; conversely, SELECT FOR UPDATE will wait for a concurrent transaction that has run any of those commands on the same row, and will then lock and return the updated row (or no row, if the row was deleted). Within a REPEATABLE READ or SERIALIZABLE transaction, however, an error will be thrown if a row to be locked has changed since the transaction started. For further discussion see Section 13.4.
The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also by an UPDATE that modifies the values on certain columns. Currently, the set of columns considered for the UPDATE case are those that have a unique index on them that can be used in a foreign key (so partial indexes and expressional indexes are not considered), but this may change in the future.
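To make the difference concrete, here is a hedged sketch of both approaches against the users table from the question (column names taken from the question):

-- Table-level: blocks every other statement on the table, even plain SELECTs
BEGIN;
LOCK TABLE users IN ACCESS EXCLUSIVE MODE;
UPDATE users SET money = money + 1 WHERE name = 'John';
COMMIT;

-- Row-level: blocks only writers and other FOR UPDATE readers of the matched row
BEGIN;
SELECT * FROM users WHERE name = 'John' FOR UPDATE;
UPDATE users SET money = money + 1 WHERE name = 'John';
COMMIT;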

I'm using my own table here; it is named paid_properties and has two columns, user_id and counter.
Since you want one transaction at a time, you can use one of the following locks:
FOR UPDATE mode assumes a total change (or delete) of a row.
FOR NO KEY UPDATE mode assumes a change only to the fields that are not involved in unique indexes (in other words, this change does not affect foreign keys).
The UPDATE command itself selects the minimum appropriate locking mode; rows are usually locked in the FOR NO KEY UPDATE mode.
To test it, run the following query in one tab (I'm using pgAdmin 4):
BEGIN;
SELECT * FROM paid_properties WHERE user_id = 37 LIMIT 1 FOR NO KEY UPDATE;
SELECT pg_sleep(60);
UPDATE paid_properties set counter = 4 where user_id = 37;
-- ROLLBACK; -- If you want to discard the operations you did above
END;
And the following query in another tab:
UPDATE paid_properties set counter = counter + 90 where user_id = 37;
You'll see that the second query is not executed until the first one finishes, and the counter ends up as 94, which is correct in my case.
For more information:
https://postgrespro.com/blog/pgsql/5968005
https://www.postgresql.org/docs/current/explicit-locking.html
Hope this is helpful

Related

SQL Server statement explain

Please explain what the following statements mean. They assign local variables, but I don't understand what inserted or deleted mean.
select @ID = ID from inserted
select @co_ID = co_ID from deleted
Thank you
INSERTED and DELETED are temporary, memory-resident tables created by SQL Server for use (or misuse) within a DML trigger.
Inserts and updates copy new rows into INSERTED,
Deletes and updates copy old rows into DELETED.
It looks like this code is attempting to audit a change to a row of data - but will fail unless there is something else in the code path guaranteeing that only a single row will be updated.
These statements mean that you have written a trigger in SQL Server that is not safe. The trigger assumes that only one row has been updated. This is not safe because SQL Server calls triggers based on groups of rows.
If there is one row, then the variables @ID and @co_ID are assigned values from that row. If multiple rows are updated, then values from arbitrary -- and perhaps different -- rows are assigned to them.
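A set-based rewrite avoids that single-row assumption. Here is a minimal sketch (the audit table dbo.AuditLog and its columns are invented for illustration):

CREATE TRIGGER trg_AuditChange ON dbo.MyTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- process every affected row, not just one
    INSERT INTO dbo.AuditLog (ID, old_co_ID, new_co_ID, ChangedAt)
    SELECT i.ID, d.co_ID, i.co_ID, GETDATE()
    FROM inserted AS i
    INNER JOIN deleted AS d ON d.ID = i.ID;
END;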

PostgreSQL: find out if a row has been created by the current transaction

Is it possible to find out if a row in a table has been created by the current transaction (and therefore is not yet visible for other transactions, because the current transaction is still active)?
My use case: I am adding event logging to the database. This is done in plpgsql triggers. A row in the event table looks like this: (event_id serial, event_action text, count integer default 1).
Now, the reasoning behind my question: If a certain row has been created by this transaction (most likely in another trigger), I could increment the count instead of creating a new row in the event table.
You could just look for logging entries like this:
SELECT ...
FROM tablename
WHERE xmin::text::bigint = txid_current() % (2^32)::bigint;
That will find all rows added or modified in the current transaction.
The downside is that this will force a sequential scan of the whole table, and you cannot avoid that since you cannot have an index on a system column.
So you could add an extra column xid to your table that is filled with txid_current()::bigint whenever a row is inserted or updated. Such a column can be indexed and efficiently used in a search:
SELECT ...
FROM tablename
WHERE xid = txid_current();
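A hedged sketch of how such a column could be maintained (the trigger function name is an assumption):

ALTER TABLE tablename ADD COLUMN xid bigint;

CREATE FUNCTION set_xid() RETURNS trigger AS $$
BEGIN
    NEW.xid := txid_current();  -- already a bigint
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tablename_set_xid
    BEFORE INSERT OR UPDATE ON tablename
    FOR EACH ROW EXECUTE PROCEDURE set_xid();

CREATE INDEX ON tablename (xid);  -- a regular column, so it can be indexed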
You might consider something like this:
create table ConnectionCurrentAction (
connectionID int primary key,
currentActionID uuid
)
then at the beginning of the transaction:
delete from ConnectionCurrentAction where connectionID = pg_backend_pid();
insert into ConnectionCurrentAction (connectionID, currentActionID)
select pg_backend_pid(), uuid_generate_v4();
You can wrap this in a function called, say, audit_action_begin.
Note: You may instead choose to enforce the requirement that an "action" be created explicitly by removing the delete here.
At the end of a transaction, do audit_action_end:
delete from ConnectionCurrentAction where connectionID = pg_backend_pid();
Whenever you want to know the current transaction:
select currentActionID from ConnectionCurrentAction where connectionID = pg_backend_pid();
You can wrap that in a function audit_action_current()
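Sketched as SQL functions (hedged; uuid_generate_v4() assumes the uuid-ossp extension is installed):

CREATE FUNCTION audit_action_begin() RETURNS uuid AS $$
    DELETE FROM ConnectionCurrentAction WHERE connectionID = pg_backend_pid();
    INSERT INTO ConnectionCurrentAction (connectionID, currentActionID)
    VALUES (pg_backend_pid(), uuid_generate_v4())
    RETURNING currentActionID;
$$ LANGUAGE sql;

CREATE FUNCTION audit_action_end() RETURNS void AS $$
    DELETE FROM ConnectionCurrentAction WHERE connectionID = pg_backend_pid();
$$ LANGUAGE sql;

CREATE FUNCTION audit_action_current() RETURNS uuid AS $$
    SELECT currentActionID FROM ConnectionCurrentAction
    WHERE connectionID = pg_backend_pid();
$$ LANGUAGE sql;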
You can then put the currentActionID into your log which will enable you to identify whether a row was created in the current action or not. This will also allow you to identify where rows in different audit tables were created in the current logical action.
If you don't want to use a uuid, a sequence would do just as well here. I like uuids.

Protect against parallel transaction updating row

I'm building a simple set of queries for an SQL database. I've come across a situation I want to protect against, but I don't know the database theory terminology to explain what I'm asking.
In this example I have two simultaneous transactions occurring on a database. Transaction #1 begins and Transaction #2 begins after T1 but T2 ends before T1 does its commit.
The table USERS has columns id, name, passwordHash
--Transaction #1
BEGIN TRANSACTION;
SELECT id from USERS where name = someName;
--do some work, transaction #2 starts and completes quickly while this work is being performed
UPDATE USERS SET name = newName where id = $id;
COMMIT;
--Transaction #2
BEGIN TRANSACTION;
SELECT id from USERS where name = someName;
UPDATE USERS SET passwordHash = newPasswordHash where id = $id;
COMMIT;
I would like to have some kind of safety check performed where by if I am updating a row, I am only updating the same version of that row that existed at the time the transaction started.
In this case, I would like the Transaction 1 COMMIT to fail because Transaction 2 has already updated the row belonging to user with name someName.
You can use SELECT FOR UPDATE with NOWAIT to lock the rows against concurrent modifications by other transactions. That guarantees that your later updates will run against the same version of those rows; other transactions cannot change them until your transaction commits.
Example (using Postgresql):
Transaction 1:
begin transaction;
select * from users where username = 'Fabian' for update nowait;
update users set passwordHash = '123' where username = 'Fabian';
commit;
Transaction 2, somewhere after transaction 1 has selected for update, but not committed:
> select * from users where username = 'Fabian' for update nowait;
ERROR: could not obtain lock on row in relation "users"
Edit
This is usually called pessimistic locking. The transaction that selects the row first "wins"; any later select for update will fail. If you instead want the transaction that writes a change first to win, you might want to go with an optimistic locking approach, as proposed by @Laurence.
The standard way to do this is to add a rowversion column to the table. You read this column along with the rest of the data. When you submit an update, you include it in the where clause. You can then check the number of rows affected to see if another transaction got in first.
Some databases have native support for such columns. E.g. SQL Server has the timestamp/rowversion datatype. Oracle has rowdependencies. DB2 has rowversion.
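A hedged sketch of the generic version-column approach (the RowVersion column and the variable names are assumptions, shown in T-SQL style):

-- Read the row together with its current version
SELECT id, name, RowVersion FROM USERS WHERE name = 'someName';

-- Later, update only if the version is still the one we read
UPDATE USERS
SET name = 'newName', RowVersion = RowVersion + 1
WHERE id = @id AND RowVersion = @rowVersionReadEarlier;

-- 0 rows affected means another transaction got in first: report a conflict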

How should I reliably mark the most recent row in SQL Server table?

The existing design for this program is that all changes are written to a changelog table with a timestamp. In order to obtain the current state of an item's attribute we JOIN onto the changelog table and take the row having the most recent timestamp.
This is a messy way to keep track of current values, but we cannot readily change this changelog setup at this time.
I intend to slightly modify the behavior by adding an "IsMostRecent" bit to the changelog table. This would allow me to simply pull the row having that bit set, as opposed to the MAX() aggregation or recursive seek.
What strategy would you employ to make sure that bit is always appropriately set? Or is there some alternative you suggest which doesn't affect the current use of the logging table?
Currently I am considering a trigger approach, which turns the bit off for all other rows and then turns it on for the most recent row on an INSERT.
I've done this before by having a "MostRecentRecorded" table which simply holds the most recently inserted record (Id and entity ID), maintained by a trigger.
Having an extra column for this isn't right - and can get you into problems with transactions and reading existing entries.
In the first version of this it was a simple case of
BEGIN TRANSACTION
INSERT INTO simlog (entityid, logmessage)
VALUES (11, 'test');
UPDATE simlogmostrecent
SET lastid = @@IDENTITY
WHERE simlogentityid = 11
COMMIT
Ensuring that the MostRecent table had an entry for each record in SimLog can be done in the query, but I seem to recall we did it during the creation of the entity that the SimLog referred to (the above is my recollection of the first version - I don't have the code to hand).
However, the simple version caused problems with multiple writers, as it could deadlock or fail the transaction, so it was moved into a trigger.
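From that description, the trigger version might look roughly like this (a sketch based on the answer's recollection; the id column on simlog is an assumption):

CREATE TRIGGER trg_simlog_mostrecent ON simlog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- handle multi-row inserts by taking the highest new id per entity
    UPDATE m
    SET m.lastid = x.maxid
    FROM simlogmostrecent AS m
    INNER JOIN (
        SELECT entityid, MAX(id) AS maxid
        FROM inserted
        GROUP BY entityid
    ) AS x ON x.entityid = m.simlogentityid;
END;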
Edit: Started this answer before Richard Harrison answered, promise :)
I would suggest another table with the structure similar to below:
VersionID   TableName   UniqueVal   LatestPrimaryKey
1           Orders      209         12548
2           Orders      210         12549
3           Orders      211         12605
4           Orders      212         10694
VersionID -- the table's key
TableName -- just in case you want to roll this out to multiple tables
UniqueVal -- whatever groups multiple rows into a single item with history (e.g. Order Number or some other value)
LatestPrimaryKey -- the identity key of the latest row you want to use.
Then you can simply JOIN to this table to return only the latest rows.
If you already have a trigger inserting rows into the changelog table this could be adapted:
INSERT INTO [MyChangelogTable]
(Primary, RowUpdateTime)
VALUES (@PrimaryKey, GETDATE())
-- Add onto it:
UPDATE [LatestRowTable]
SET [LatestPrimaryKey] = @PrimaryKey
WHERE [TableName] = 'Orders'
AND [UniqueVal] = @OrderNo
Alternatively it could be done as a MERGE to capture inserts as well.
One thing that comes to mind is to create a view to do all the messy MAX() queries, etc. behind the scenes. Then you can query against the view. This way you would not have to change your current setup; you just move all the messiness to one place.
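A minimal sketch of such a view (table and column names are assumptions):

CREATE VIEW CurrentChangelog AS
SELECT c.*
FROM Changelog AS c
INNER JOIN (
    SELECT ItemId, MAX(ChangedAt) AS LastChangedAt
    FROM Changelog
    GROUP BY ItemId
) AS latest
    ON latest.ItemId = c.ItemId
   AND latest.LastChangedAt = c.ChangedAt;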

SQL Server concurrency

I asked two questions at once in my last thread, and the first has been answered. I decided to mark the original thread as answered and repost the second question here. Link to original thread if anyone wants it:
Handling SQL Server concurrency issues
Suppose I have a table with a field which holds foreign keys for a second table. Initially records in the first table do not have a corresponding record in the second, so I store NULL in that field. Now at some point a user runs an operation which will generate a record in the second table and have the first table link to it. If two users simultaneously try to generate the record, a single record should be created and linked to, and the other user receives a message saying the record already exists. How do I ensure that duplicates are not created in a concurrent environment?
The steps I need to carry out are:
1) Look up x number of records in table A
2) Perform some business logic that prepares a single row which is inserted into table B
3) Update the records selected in step 1) to point to the newly created record in table B
I can use scope_identity() to retrieve the primary key of the newly created record in table B, so I don't need to worry about the new record being lost due to simultaneous transactions. However I need to eliminate the possibility of concurrently executing processes resulting in a duplicate record in table B being created.
In SQL Server 2008, this can be handled with a filtered unique index:
CREATE UNIQUE INDEX ix_MyIndexName ON MyTable (FkField) WHERE FkField IS NOT NULL
This will require all non-null values be unique, and the database will enforce it for you.
The 2005 way of simulating a unique filtered index for constraint purposes is
CREATE VIEW dbo.EnforceUnique
WITH SCHEMABINDING
AS
SELECT FkField
FROM dbo.TableB
WHERE FkField IS NOT NULL
GO
CREATE UNIQUE CLUSTERED INDEX ix ON dbo.EnforceUnique(FkField)
Connections that update the base table will need to have the correct SET options, but unless you are using non-default options this will be the case anyway in SQL Server 2005 (ARITHABORT used to be the problem one in 2000).
Using a computed column
ALTER TABLE MyTable ADD
OneNonNullOnly AS ISNULL(FkField, -PkField)
CREATE UNIQUE INDEX ix_OneNullOnly ON MyTable (OneNonNullOnly);
Assumes:
FkField is numeric
no clash of FkField and -PkField values
Decided to go with the following:
1) Begin transaction
2) UPDATE tableA SET foreignKey = -1 OUTPUT inserted.id INTO #tempTable
FROM (business logic)
WHERE foreignKey is null
3) If @@ROWCOUNT > 0 Then
3a) Create record in table 2.
3b) Capture ID of newly created record using scope_identity()
3c) UPDATE tableA SET foreignKey = IdOfNewRecord FROM tableA INNER JOIN #tempTable ON tableA.id = #tempTable.id
Since I write junk into the foreign key field in step 2), those rows are locked, and no concurrent transaction will touch them. The first transaction is free to create the record. After it commits, the blocked transaction executes its update query but captures none of the original rows, because the WHERE clause only considers NULL foreignKey fields. If no rows are returned (@@ROWCOUNT = 0), the current transaction exits without creating the record in table B and returns some sort of error message to the client (e.g. Error: Record already exists).
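Put together as a single script, that approach might look like this (a hedged sketch; tableB's columns and the error-reporting mechanism are assumptions beyond what is given above):

BEGIN TRANSACTION;

CREATE TABLE #tempTable (id int);

-- Step 2: claim the rows by writing a junk key; this also row-locks them
UPDATE tableA
SET foreignKey = -1
OUTPUT inserted.id INTO #tempTable
WHERE foreignKey IS NULL;  -- plus whatever business logic applies

IF @@ROWCOUNT > 0
BEGIN
    -- Steps 3a/3b: create the record in table B and capture its key
    DECLARE @newId int;
    INSERT INTO tableB (someColumn) VALUES ('...');
    SET @newId = SCOPE_IDENTITY();

    -- Step 3c: point the claimed rows at the new record
    UPDATE a
    SET a.foreignKey = @newId
    FROM tableA AS a
    INNER JOIN #tempTable AS t ON a.id = t.id;
END
ELSE
BEGIN
    RAISERROR('Record already exists', 16, 1);
END

COMMIT;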