Which procedure is more performant for an update which affects zero rows?
UPDATE table SET column = value WHERE id = number;
IF SQL%Rowcount > 0 THEN
COMMIT;
END IF;
or
UPDATE table SET column = value WHERE id = number;
COMMIT;
In other words, if an UPDATE affects zero rows and a commit is issued, am I incurring any added expense at all?
I have a system which is being hampered by log file sync waits, and I'm wondering whether issuing a COMMIT for a transaction which affected zero rows will still be written to the redo log and thus cause more contention on LGWR.
COMMIT does force a log file sync, so the session will indeed have to wait.
However, ROLLBACK does too, and at some point one or the other has to happen.
So if you issue neither COMMIT nor ROLLBACK, you are just leaving a transaction open, which sooner or later will cause a log file sync wait anyway.
You probably want to batch your UPDATE operations rather than committing as soon as each update succeeds.
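As a rough illustration (the table names and driving query here are hypothetical, not taken from the question), batching means one COMMIT per group of updates instead of one per statement:

BEGIN
  FOR rec IN (SELECT id, new_value FROM pending_updates) LOOP  -- hypothetical staging table
    UPDATE some_table
       SET some_column = rec.new_value
     WHERE id = rec.id;           -- may affect zero rows; that is fine
  END LOOP;
  COMMIT;                         -- a single log file sync for the whole batch
END;
/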
There are risks in this. Technically, even if the UPDATE affects zero rows, it can still fire BEFORE and AFTER statement-level UPDATE triggers on the table (row-level triggers will not fire). Those triggers could potentially "do something" that requires a commit/rollback.
It is safer to check whether DBMS_TRANSACTION.LOCAL_TRANSACTION_ID is set.
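A sketch of that check, using Oracle's DBMS_TRANSACTION package (the table, column, and values are placeholders standing in for the UPDATE from the question):

DECLARE
  l_txn_id VARCHAR2(200);
BEGIN
  UPDATE t SET col = 1 WHERE id = 42;                  -- placeholder for the real update
  l_txn_id := DBMS_TRANSACTION.LOCAL_TRANSACTION_ID;   -- NULL if no transaction was started
  IF l_txn_id IS NOT NULL THEN
    COMMIT;   -- the update, or a statement-level trigger, really did change something
  END IF;
END;
/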
There are any number of reasons which can underlie waits for log file sync, and it seems unlikely that the main culprit is committing SQL statements which updated zero rows. It is true that issuing too many commits can cause this problem, for instance if the application commits after every statement (e.g. by using AUTOCOMMIT=TRUE) instead of using properly designed transactions. If that is the cause, there is not much you can do short of a major rewrite of the application.
If you want to delve deeper into the root causes of your problem I recommend you read this exhaustive (and exhausting) article by Pythian's Riyaj Shamsudeen on Tuning ‘log file sync’ Event Waits.
Let's assume I have the following query in two separate SSMS query windows:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
UPDATE dbo.Jobs
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting'
--I'm NOT committing yet
--Commit Transaction
I run query window 1 (but do not commit), and then I run query window 2.
I want query window 2 to immediately update only rows that were inserted after I started query 1 (all new records come in with a status of 'Waiting'). However, SQL Server waits for the first query to finish, because an UPDATE statement does not read dirty values (even when the session is set to READ UNCOMMITTED).
Is there a way to overcome this?
In my application I will have two (or more) processes running this. I want process 2 to be able to pick up the rows that process 1 has not picked up; I don't want process 2 to have to wait until process 1 is finished.
What you are asking for is simply impossible.
Even at the lowest isolation level of READ UNCOMMITTED (aka NOLOCK), an X-Lock (exclusive) must be taken in order to make modifications. In other words, writes are always locked, even if the reads that fetched those rows were not locked.
So even though session 2 is running under READ UNCOMMITTED also, if it wants to do a modification it must also take an X-Lock, which is incompatible with the first X-Lock.
The solution here is to either do this in one session, or commit immediately. In any case, do not hold locks for any length of time, as it can cause massive blocking chains and even deadlocks.
If you just want to skip the rows that the other session has locked, rather than wait for them, you could use the WITH (READPAST) hint.
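For example (this reuses the Jobs table from the question; note that READPAST requires READ COMMITTED or REPEATABLE READ, so session 2 would drop the READ UNCOMMITTED setting):

UPDATE dbo.Jobs WITH (READPAST)
SET [status] = 'Running'
OUTPUT Inserted.*
WHERE [status] = 'Waiting';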
READ UNCOMMITTED as an isolation level or as a hint has huge issues.
It can cause anything from deadlocks to completely incorrect results. For example, you could read a row twice, or not at all, when by the logical definition of the schema there should have been exactly one row. You could read entire pages twice or not at all.
You can get deadlocks due to U-Locks not being taken in UPDATE and DELETE statements.
And you still take schema locks, so you can still get stuck behind a synchronous statistics update or an index rebuild.
While using the Python connector for Snowflake with queries of the form
UPDATE X.TABLEY SET STATUS = %(status)s, STATUS_DETAILS = %(status_details)s WHERE ID = %(entry_id)s
I sometimes get the following message:
(snowflake.connector.errors.ProgrammingError) 000625 (57014): Statement 'X' has locked table 'XX' in transaction 1588294931722 and this lock has not yet been released.
and soon after that
Your statement 'X' was aborted because the number of waiters for this lock exceeds the 20 statements limit
This usually happens when multiple queries are trying to update a single table. What I don't understand is that the query history in Snowflake says the query finished successfully (Succeeded status), but in reality the UPDATE never happened, because the table was not altered.
So according to https://community.snowflake.com/s/article/how-to-resolve-blocked-queries I used
SELECT SYSTEM$ABORT_TRANSACTION(<transaction_id>);
to release the lock, but still nothing happened, and even with the Succeeded status the query seems not to have executed at all. So my question is: how does this really work, and how can a lock be released without losing the execution of the query? (Also, what happens to the other 20+ queries that are queued because of the lock? Sometimes it seems that when the lock is released the next one takes the lock and has to be aborted as well.)
I would appreciate it if you could help me. Thanks!
Not sure if Sergio got an answer to this. The problem in this case is not with the table. Based on my experience with Snowflake, below is my understanding.
In Snowflake, every table operation also involves a change to a metadata table which keeps track of micro-partitions and their min/max values. This metadata table supports only 20 concurrent DML statements by default, so if a table is continuously being updated and hit at the same partition, there is a chance that this limit will be exceeded. In that case we should look at redesigning the table's update/insert logic. In one of our use cases we increased the limit to 50 after speaking to the Snowflake support team.
UPDATE, DELETE, and MERGE cannot run concurrently on a single table; they will be serialized, as only one can take a lock on the table at a time. The others queue up in the "blocked" state until it is their turn to take the lock. There is a limit on the number of queries that can be waiting on a single lock.
If you see an update finish successfully but don't see the updated data in the table, then you are most likely not COMMITting your transactions. Make sure you run COMMIT after an update so that the new data is committed to the table and the lock is released.
Alternatively, you can make sure AUTOCOMMIT is enabled so that DML will commit automatically after completion. You can enable it with ALTER SESSION SET AUTOCOMMIT=TRUE; in any sessions that are going to run an UPDATE.
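As a minimal sketch (the UPDATE and its bind placeholders are the ones from the question; run whichever option fits your setup):

-- Option 1: explicit transaction control
UPDATE X.TABLEY SET STATUS = %(status)s, STATUS_DETAILS = %(status_details)s WHERE ID = %(entry_id)s;
COMMIT;   -- releases the table lock and makes the change visible

-- Option 2: let every DML statement commit on its own
ALTER SESSION SET AUTOCOMMIT = TRUE;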
I have an issue with "could not serialize access due to concurrent update". I checked logs and I can clearly see that two transactions were trying to update a row at the same time.
my sql query
UPDATE sessionstore SET valid_until = %s WHERE sid = %s;
How can I tell postgres to "try" update row without throwing any exception?
There is a caveat here which has been mentioned in the comments: you only get this error if you are using the REPEATABLE READ transaction isolation level or higher, and that is not typically required unless you really have a specific reason.
Your problem will go away if you use the standard READ COMMITTED level. But it is still better to use SKIP LOCKED, to avoid lock waits, redundant updates, and wasted WAL traffic.
As of Postgres 9.5+, there is a much better way to handle this, which would be like this:
UPDATE sessionstore
SET valid_until = %s
WHERE sid = (
SELECT sid FROM sessionstore
WHERE sid = %s
FOR UPDATE SKIP LOCKED
);
The first transaction to acquire the lock in SELECT FOR UPDATE SKIP LOCKED will cause any conflicting transaction to select nothing, leading to a no-op. As requested, it will not throw an exception.
See SKIP LOCKED notes here:
https://www.postgresql.org/docs/current/static/sql-select.html
Also, the advice about a savepoint is not specific enough. What if the update fails for a reason besides a serialization error, like an actual deadlock? You don't want to just silently ignore all errors. Savepoints and exception handlers are also, in general, a bad idea here: one of them per row update is a lot of extra overhead, especially if you have high traffic. That is why you should use READ COMMITTED and SKIP LOCKED together; it simplifies the matter, and any error that then does occur is truly unexpected.
The canonical way to do that would be to set a savepoint before the UPDATE:
SAVEPOINT x;
If the update fails,
ROLLBACK TO SAVEPOINT x;
and the transaction can continue.
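Putting those pieces together, the flow looks roughly like this (sessionstore and sid come from the question; the literal values are placeholders, and the failure branch is shown as a comment):

BEGIN;
SAVEPOINT x;
UPDATE sessionstore SET valid_until = now() + interval '1 hour' WHERE sid = 'some-sid';
-- if the UPDATE failed with "could not serialize access due to concurrent update":
ROLLBACK TO SAVEPOINT x;
-- ...continue with the rest of the transaction...
COMMIT;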
The default isolation level is "read committed"; you don't need to change it unless a specific use case requires it.
You must be using the "repeatable read" or "serializable" isolation level. There, the current transaction rolls back if an already running transaction updates a value which the current transaction was also supposed to update.
This scenario is handled easily by the "read committed" isolation level, where the current transaction simply picks up the value committed by the other transaction and performs its own update once that transaction has committed.
ALTER DATABASE your_database SET DEFAULT_TRANSACTION_ISOLATION TO 'read committed';  -- substitute your database name
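If you only want to change it for the current session rather than the database-wide default (a session-level alternative, not part of the original answer):

SET default_transaction_isolation TO 'read committed';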
Ref: https://www.postgresql.org/docs/9.5/transaction-iso.html
I'm pretty new to PL/SQL, although I've got lots of DB experience with other RDBMSs. Here's my current issue.
procedure CreateWorkUnit
is
begin
update workunit
set workunitstatus = 2 --workunit loaded
where SYSDATE between START_DATE and END_DATE
and workunitstatus = 1; --workunit created
--commit here?
loader; --loads records based on status, will have a commit of its own
update workunit wu
set workunititemcount = (select count(*) from workunititems wui where wui.wuid = wu.wuid)
where workunitstatus = 2;
end CreateWorkUnit;
So the behaviour I'm seeing, with or without commit statements, is that I have to execute twice: once to flip the statuses, and then the loader runs on the second execution. I'd like it all to run in one go.
I'd appreciate any words of oracle wisdom.
Thanks!
When to commit transactions in a batch procedure? It is a good question, although it only seems vaguely related to the problems with the code you post. But let's answer it anyway.
We need to commit when the PL/SQL procedure has completed a unit of work. A unit of work is a business transaction. This would normally be at the end of the program, the last statement before the EXCEPTION section.
Sometimes not even then. The decision to commit or roll back properly lies with the top of the calling stack. If our PL/SQL is being called from a client (maybe a user clicking a button on a screen), then perhaps the client should issue the commit.
But it is not unreasonable for a batch process to manage its own commit (and rollback in the case of errors). The main point is that only the topmost procedure should issue COMMIT. If a procedure calls other procedures, those called programs should not issue commits or rollbacks; they should handle any errors (log them, etc.) and re-raise them to the calling program, and let it decide whether to roll back. Because all the called procedures run in the same session and hence the same transaction, a rollback in a called program would revert all the changes in the batch process. That's not right. The same reasoning applies to commits.
You will sometimes read advice about using intermittent commits to break up long-running processes into smaller units, e.g. committing every 1000 inserts. This is bad advice for several reasons, not all of them related to transactions. The pertinent ones are:
Issuing a commit releases locks and undo. Committing inside a loop that is still fetching from a cursor ("fetching across commits") allows the undo that the query needs for read consistency to be overwritten, which is the classic cause of ORA-01555 "snapshot too old" errors.
It also breaks read consistency, which applies at the statement and/or transaction level. Committing while fetching from an open cursor declared FOR UPDATE is the cause of ORA-01002 "fetch out of sequence" errors.
It affects re-startability. If the program fails having processed 30% of the records, can we be confident it will only process the remaining 70% when we re-run the batch?
Once we commit records other sessions can see those changes: does it make sense for other users to see a partially changed view of the data?
So, the words of "Oracle wisdom" are: always align the database transaction with the business transaction, with a single commit per unit of work.
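A sketch of that shape (all procedure names here are hypothetical):

-- Only the top-level batch procedure commits; the helpers it calls do not.
PROCEDURE run_nightly_batch IS
BEGIN
  load_staging_rows;      -- no COMMIT inside
  apply_price_changes;    -- no COMMIT inside
  refresh_summaries;      -- no COMMIT inside
  COMMIT;                 -- one commit for the whole business transaction
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;             -- undo the entire unit of work
    RAISE;                -- let the caller/scheduler see the failure
END run_nightly_batch;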
Somebody mentioned autonomous transactions as a way of issuing commits in sub-processes. This is usually a bad idea. Changes made in an autonomous transaction are committed, and become visible to other sessions, regardless of whether the main transaction ever commits. That very rarely makes sense. It also creates the same problems with re-startability which I discussed earlier.
The only acceptable use for autonomous transactions is recording activity (error logs, trace, audit records). We need that data to persist regardless of what happens in the wider transaction. Any other use of the pragma is almost certainly a workaround for a poor design, which actually just makes the problem worse.
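That logging pattern typically looks something like this (the table and procedure names are hypothetical):

PROCEDURE log_error (p_msg IN VARCHAR2) IS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO error_log (logged_at, message)
  VALUES (SYSTIMESTAMP, p_msg);
  COMMIT;   -- persists even if the calling transaction later rolls back
END log_error;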
You may not need to commit inside the PL/SQL procedure at all. The procedures you call from another procedure run in the same session, so they don't need to commit for their work to be part of the transaction. Note also that a procedure's uncommitted changes are rolled back if the session rolls back or an unhandled exception propagates out.
I mis-classified my problem. I thought this was a transaction problem, but really it was one of my flags not being set as expected: a number field was NULL when I was expecting 0.
Sorry for that.
Josh Robinson
A large SQL Server 2008 table is normally being updated in (relatively) small chunks using a SNAPSHOT ISOLATION transaction. Snapshot works very well for those updates since the chunks never overlap. These updates aren't a single long running operation, but many small one-row insert/update grouped by the transaction.
I would like a lower-priority transaction to update all the rows which aren't currently locked. Does anyone know how I can get this behavior? Will another SNAPSHOT ISOLATION transaction fail as soon as a row clashes, or will it update everything it can before failing?
Could SET DEADLOCK_PRIORITY LOW with a try-catch be of any help? Maybe in a retry loop with a WHERE which targets only rows which haven't been updated?
Snapshot isolation doesn't really work that way; its optimistic model means conflicts aren't checked until the transaction is ready to write/commit. You also can't set a query 'priority' per se, and the READPAST hint cannot be used under snapshot isolation.
Each update is an implicit atomic transaction, so if 1 update out of 10 (grouped in a single transaction) fails, they all roll back.
SET DEADLOCK_PRIORITY only sets a preference for which transaction is rolled back in the event of a deadlock (otherwise the 'cheapest' rollback is selected).
A try-catch is pretty much a requirement if you're expecting regular collisions.
The retry loop would work, as would using a different locking model with the NOWAIT hint to bail out of statements that would otherwise block.
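A rough sketch of that retry approach (the table, column, retry count, and delay are all assumptions to adapt):

SET DEADLOCK_PRIORITY LOW;

DECLARE @tries int = 0;
WHILE @tries < 5
BEGIN
    BEGIN TRY
        UPDATE dbo.BigTable WITH (NOWAIT)   -- hypothetical table; NOWAIT fails fast instead of blocking
        SET Processed = 1
        WHERE Processed = 0;                -- only rows not yet handled
        BREAK;                              -- success, stop retrying
    END TRY
    BEGIN CATCH
        SET @tries = @tries + 1;            -- lock timeout, conflict, or deadlock: back off and retry
        WAITFOR DELAY '00:00:05';
    END CATCH;
END;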
A SNAPSHOT ISOLATION transaction fails as soon as it encounters an update conflict. However, I would use a queue outside the database to prioritize the updates.