Does Postgres log implicit transactions? - sql

Postgres docs state
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.
SELECT statement logs aren't wrapped in BEGIN and COMMIT when I set log_statement='all' (as per How to log PostgreSQL queries?). INSERT logs, on the other hand, are wrapped in BEGIN and COMMIT.
Are implicit transactions excluded from logs?
Related: Does Postgresql implicitly wrap select statements in transaction?

Does Postgres log implicit transactions?
No.
The logs show explicit SQL statements from clients. The implicit transactions around standalone statements weren't controlled by statements, so BEGIN and COMMIT don't turn up in the logs.

Related

Looping through a CURSOR in PL/PGSQL without locking the table

I have a simple PL/PGSQL block Postgres 9.5 that loops over records in a table and conditionally updates some of the records.
Here's a simplified example:
DO $$
DECLARE
-- Define a cursor to loop through records
my_cool_cursor CURSOR FOR
SELECT
u.id AS user_id,
u.name AS user_name,
u.email AS user_email
FROM users u
;
BEGIN
FOR record IN my_cool_cursor LOOP
-- Simplified example:
-- If user's first name is 'Anjali', set email to NULL
IF record.user_name = 'Anjali' THEN
BEGIN
UPDATE users SET email = NULL WHERE id = record.user_id;
END;
END IF;
END LOOP;
END;
$$ LANGUAGE plpgsql;
I'd like to execute this block directly against my database (from my app, via the console, etc...). I do not want to create a FUNCTION() or stored procedure to do this operation.
The Issue
The issue is that the CURSOR and LOOP create a table-level lock on my users table, since everything between the outer BEGIN...END runs in a transaction. This blocks any other pending queries against it. If users is sufficiently large, this locks it up for several seconds or even minutes.
What I tried
I tried to COMMIT after each UPDATE so that it clears the transaction and the lock periodically. I was surprised to see this error message:
ERROR: cannot begin/end transactions in PL/pgSQL
HINT: Use a BEGIN block with an EXCEPTION clause instead.
I'm not quite sure how this is done. Is it asking me to raise an EXCEPTION to force a COMMIT? I tried reading the documentation on Trapping Errors but it only mentions ROLLBACK, so I don't see any way to COMMIT.
How do I periodically COMMIT a transaction inside the LOOP above?
More generally, is my approach even correct? Is there a better way to loop through records without locking up the table?
1.
You cannot COMMIT within a PostgreSQL function or DO command at all (plpgsql or any other PL). The error message you reported is to the point (as far as Postgres 9.5 is concerned):
ERROR: cannot begin/end transactions in PL/pgSQL
A procedure could do that in Postgres 11 or later. See:
PostgreSQL cannot begin/end transactions in PL/pgSQL
In PostgreSQL, what is the difference between a “Stored Procedure” and other types of functions?
There are limited workarounds to achieve "autonomous transactions" in older versions:
How do I do large non-blocking updates in PostgreSQL?
Does Postgres support nested or autonomous transactions?
Do stored procedures run in database transaction in Postgres?
But you do not need any of this for the presented case.
2.
Use a simple UPDATE instead:
UPDATE users
SET email = NULL
WHERE user_name = 'Anjali'
AND email IS DISTINCT FROM NULL; -- optional improvement
Only locks the rows that are actually updated (with corner case exceptions). And since this is much faster than a CURSOR over the whole table, the lock is also very brief.
The added AND email IS DISTINCT FROM NULL avoids empty updates. Related:
Update a column of a table with a column of another table in PostgreSQL
How do I (or can I) SELECT DISTINCT on multiple columns?
It's rare that explicit cursors are useful in plpgsql functions.
If you want to avoid locking rows for a long time, you could also define a cursor WITH HOLD, for example using the DECLARE SQL statement.
Such cursors can be used across transaction boundaries, so you could COMMIT after a certain number of updates. The price you are paying is that the cursor has to be materialized on the database server.
Since you cannot use transaction statements in functions, you will either have to use a procedure or commit in your application code.

Why use Implicit and Explicit Transaction Modes in SQL Server?

Reading MS documentation on different transaction modes in SQL Server.
Autocommit mode does everything the Implicit and Explicit Transaction mode does with less code, so why should I use Implicit and Explicit Transaction modes in my code ?
Autocommit transaction is only for single query. If you need transaction involving multiple queries, you must use Implicit and Explicit Transaction mode.
As you know sqlserver automatically done the job of transaction commit. But some time we need to commit /rollback on particular condition/logic/business rule(s).
For example, we have one master table and 3 child/details table or say 1 or more child tables.
Suppose, we must save master table entry along with all details table with reference of master table's pk-id. In any case anything an issue then revert whole thing.
So in this scenerio we need to use explicit transaction to commit or rollback as a unit of work. We can use try..catch block for error handling and rollback the transaction.
If we not used this transaction, then after each insert statement sqlserver auto-commit the inserted row and not rollback ever.

When do I have to commit?

I heard in SQL I do not have to commit every statement. Perhaps create I don't have to.
So can you answer me which Statements I have to commit?
I read, that I have to commit all transactions, but I don't know what this is and can't find it anywhere.
Thanks for your help.
Per the SQL standard, most statements that require a transaction will automatically open one.
Some database engines, such as SQL Server, will (by default) automatically commit the transaction if the statement completes successfully. See Autocommit Transactions.
Autocommit mode is the default transaction management mode of the SQL Server Database Engine. Every Transact-SQL statement is committed or rolled back when it completes
SQL Server also has an Implicit Conversions mode which will leave the transaction open until it's explicitly commited.
When operating in this second such mode (which is the default, I believe, for Oracle), or if you've explicitly created a transaction, it's up to you as a developer when to commit the transaction. It should be when you've accomplished a "complete" set of operations against the database.
If you BEGIN a transaction then you have to either ROLLBACK or COMMIT
Example:
BEGIN TRAN
--Your code
INSERT INTO
NewTable
SELECT *
FROM TABLE
COMMIT TRAN
If you do not use that, it is committed upon execution. So the follow will either fail or be committed:
INSERT INTO
NewTable
SELECT *
FROM Table
If there is an error (like there is no NewTable in the DB) the execution will raise an error and the transaction will roll back. If there is no error the transaction will be committed.

How does SQL Server treat statements inside stored procedures with respect to transactions?

Say I have a stored procedure consisting of several separate SELECT, INSERT, UPDATE and DELETE statements. There is no explicit BEGIN TRANS / COMMIT TRANS / ROLLBACK TRANS logic.
How will SQL Server handle this stored procedure transaction-wise? Will there be an implicit connection for each statement? Or will there be one transaction for the stored procedure?
Also, how could I have found this out on my own using T-SQL and / or SQL Server Management Studio?
Thanks!
There will only be one connection, it is what is used to run the procedure, no matter how many SQL commands within the stored procedure.
since you have no explicit BEGIN TRANSACTION in the stored procedure, each statement will run on its own with no ability to rollback any changes if there is any error.
However, if you before you call the stored procedure you issue a BEGIN TRANSACTION, then all statements are grouped within a transaction and can either be COMMITted or ROLLBACKed following stored procedure execution.
From within the stored procedure, you can determine if you are running within a transaction by checking the value of the system variable ##TRANCOUNT (Transact-SQL). A zero means there is no transaction, anything else shows how many nested level of transactions you are in. Depending on your sql server version you could use XACT_STATE (Transact-SQL) too.
If you do the following:
BEGIN TRANSACTION
EXEC my_stored_procedure_with_5_statements_inside #Parma1
COMMIT
everything within the procedure is covered by the transaction, all 6 statements (the EXEC is a statement covered by the transaction, 1+5=6). If you do this:
BEGIN TRANSACTION
EXEC my_stored_procedure_with_5_statements_inside #Parma1
EXEC my_stored_procedure_with_5_statements_inside #Parma1
COMMIT
everything within the two procedure calls are covered by the transaction, all 12 statements (the 2 EXECs are both statement covered by the transaction, 1+5+1+5=12).
You can find out on your own by creating a small stored procedure that does something simple, say insert a record into a test table. Then Begin Tran; run sp_test; rollback; Is the new record there? If so, then the SP ignores the outside transaction. If not, then the SP is just another statement executed inside the transaction (which I am pretty sure is the case).
You must understand that a transaction is a state of the session. The session can be in an explicit transaction state because there is at least one BEGIN TRANSACTION that have been executed in the session wherever the command "BEGIN TRANSACTION" has been throwed (before entering in a routine or inside the routine code). Otherwise, the state of the session is in an implicit transaction state. You can have multiple BEGIN TRANSACTION, but only the first one change the behavior of the session... The others only increase the ##TRANCOUNT global sesion variable.
Implicit transaction state means that all SQL orders (DDL, DML and DCL comands) wil have an invisble integrated transaction scope.

Is there a difference between a SELECT statement inside a transaction and one that is outside of it?

Does the default READ COMMITTED isolation level somehow makes the SELECT statement act different inside of a transaction than one that is not in a transaction?
I am using MS SQL.
Yes, the one inside the transaction can see changes made by other previous Insert/Update/delete statements in that transaction; a Select statement outside the transaction cannot.
If all you are asking about is what the Isolation Level does, then understand that all Select statements (hey, all statements of any kind) - are in a transaction. The only difference between one that is explicitly in a transaction and one that is standing on its own is that the one that is standing alone starts its transaction immediately before it executes it, and commits or roll back immediately after it executes;
whereas the one that is explicitly in a transaction can (because it has a Begin Transaction statement) can have other statements (inserts/updates/deletes, whatever) occurring within that same transaction, either before or after that Select statement.
So whatever the isolation level is set to, both selects (inside or outside an explicit transaction) will nevertheless be in a transaction which is operating at that isolation level.
Addition:
The following is for SQL Server, but all databases MUST work in the same way. In SQL Server the Query Processor is always in one of 3 Transaction Modes, AutoCommit, Implicit, or Explicit.
AutoCommit is the default transaction management mode of the SQL Server Database Engine. .. Every Transact-SQL statement is committed or rolled back when it completes. ... If a statement completes successfully, it is committed; if it encounters any error, it is rolled back. This is the default, and is the answer to #Alex's question in the comments.
In Implicit Transaction mode, "... the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions. ..." Note that the italicized snippet is for each transaction, whether it be a single or multiple statement transaction.
The engine is placed in Explicit Transaction mode when you explicitly initiate a transaction with BEGIN TRANSACTION Statement. Then, every statement is executed within that transaction until you explicitly terminate the transaction (with COMMIT or ROLLBACK) or if a failure occurs that causes the engine to terminate and Rollback.
Yes, there is a bit of a difference. For MySQL, the database doesn't actually start with a snapshot until your first query. Therefore, it's not begin that matters, but the first statement within the transaction. If I do the following:
#Session 1
begin; select * from table;
#Session 2
delete * from table; #implicit autocommit
#Session 1
select * from table;
Then I'll get the same thing in session one both times (the information that was in the table before I deleted it). When I end session one's transaction (commit, begin, or rollback) and check again from that session, the table will show as empty.
The READ COMMITTED isolation level is about the records that have been written. It has nothing to do with whether or not this select statement is in a transaction (except for those things written during that same transaction).
If your database (or in mysql, the underlying storage engine of all tables used in your select statement) is transactional, then there simply no way to execute it "outside of a transaction".
Perhaps you meant "run it in autocommit mode", but that is not the same as "not transactional". In the latter case, it still runs in a transaction, it's just that the transaction ends immediately after your statement is finshed.
So, in both cases, during the run, a single select statement will be isolated at the READ COMMITTED level from the other transactions.
Now what this means for your READ COMMITTED transaction isolation level: perhaps surprisingly, not that much.
READ COMMITTED means that you may encounter non-repeatable reads: when running multiple select statements in the same transaction, it is possible that rows that you selected at a certain point in time are modified and comitted by another transaction. You will be able to see those changes when you re-execute the select statement later on in the same pending transaction. In autocommit mode, those 2 select statements would be executed in their own transaction. If another transaction would have modified and committed the rows you selected the first time, you would be able to see those changes just as well when you executed the statement the second time.