Let's say I need to run some queries AND send an email atomically.
A typical example is a user sign-up form: I need to create the user and send the welcome email.
I can use a transaction:
begin transaction
create user
if creation failed, roll back the transaction and bail out
send email
if email sending failed, roll back the transaction and bail out
commit transaction
but, in PostgreSQL, a COMMIT can fail (for example when using DEFERRED constraints).
So the solution would be to use a two-phase commit:
begin transaction
create user
PREPARE TRANSACTION
if creation failed (PREPARE is where deferred constraints are checked), roll back and bail out
send email
if email sending failed, ROLLBACK PREPARED and bail out
COMMIT PREPARED -- this one is guaranteed to work by Postgres
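For reference, a minimal sketch of what that flow could look like in plain SQL (the table, column, and the transaction label 'signup_42' are made-up names; this requires max_prepared_transactions > 0):

BEGIN;
INSERT INTO users (email) VALUES ('alice@example.com');
-- PREPARE is the step that can still fail on deferred constraint violations
PREPARE TRANSACTION 'signup_42';
-- ...send the welcome email here; the prepared transaction survives even a session crash...
COMMIT PREPARED 'signup_42';       -- if the email went out
-- ROLLBACK PREPARED 'signup_42';  -- if it did not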
but the Postgres docs say:
It is unwise to leave transactions in the prepared state for a long time.
Moreover :
The intended usage of the feature is that a prepared transaction will normally be committed or rolled back as soon as an external transaction manager has verified that other databases are also prepared to commit.
So:
Does sending an email take too much time?
What problems could arise if we do so?
What would be an acceptable timeout if it takes too long?
Are database locks different in the prepared state than in the idle state (i.e. before the commit)?
Sending an email is not transactional. It is possible you will never know whether it was successfully delivered or not, and clearly "forever" is too long to hold onto a prepared transaction.
You will want to structure the table so that partially created users can be committed but still known to be unconfirmed, for example with a column indicating as much. That way, troubleshooters can actually see the partially created users and decide what to do about them. With prepared transactions, the semi-committed rows are invisible, so no one can figure out what is going on.
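A minimal sketch of that pattern (all names invented):

BEGIN;
INSERT INTO users (email, welcome_email_sent) VALUES ('alice@example.com', false);
COMMIT;
-- send the email outside any transaction, then record the outcome
UPDATE users SET welcome_email_sent = true WHERE email = 'alice@example.com';
-- a periodic job can retry or clean up rows where welcome_email_sent is still false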
To put it a somewhat different way: to use two-phase commit effectively, you need to have a transaction manager. Do you? You didn't describe one. Are you planning to write your own? Do you know how much work that will be?
Related
I want to implement an audit log using triggers which fire when data is created, changed, or deleted, to store some values. Those triggers should be able to use the user IDs which made the changes and which are managed by the web application. I have some ideas on providing this data, but I don't seem to fully understand what the execution context of a trigger is. I've read through the PostgreSQL docs (Overview of Trigger Behavior and others), but my question doesn't seem to be answered.
What I want to know is how a client session with one running transaction interacts with trigger execution, what the lifetime of each is, and how they depend on each other. From my understanding, triggers are executed within the database independently of the client session which created the event that led to trigger execution. Is that correct? That would mean triggers and their processing wouldn't impact the performance of the client request, and the client could close the session at any time. If both are independent, how would a trigger get notified about a client rolling back a transaction, which would logically mean that no data got changed at all? Or are triggers only executed after a transaction commits, because they run independently?
Or are triggers executed asynchronously within the client session which created the events that led to trigger execution? This would mean that if the client closes its session for any reason, the trigger would abort, too. Their changes would be directly bound to the client's transaction and could be rolled back, too.
I need to understand the behavior to know what I would like to do in another question.
Thanks for your input!
From my understanding, triggers are executed within the database
independently of the client session which created the event that
led to trigger execution. Is that correct? That would mean triggers
and their processing wouldn't impact performance of the client request
and the client can close the session at any time
No, they totally depend on the client session, as part of the transaction, which itself is tied to the session.
See this excerpt from CREATE TRIGGER (9.1):
They can be fired either at the end of the statement causing the
triggering event, or at the end of the containing transaction; in the
latter case they are said to be deferred
From your other question it appears you're using 8.4, which doesn't have deferred triggers, so it's even simpler: triggers always run at the end of the statement (the triggering event), which means before the acknowledgment of execution is sent by the server to the client.
A COMMIT immediately following would be a new instruction, and could not be executed before the trigger has finished.
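You can see this with a toy example (all names invented); the row written by the trigger disappears together with the client's rolled-back transaction:

CREATE TABLE t (id int);
CREATE TABLE audit (note text);

CREATE FUNCTION log_insert() RETURNS trigger AS $$
BEGIN
  INSERT INTO audit VALUES ('row inserted');
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_audit AFTER INSERT ON t
  FOR EACH ROW EXECUTE PROCEDURE log_insert();

BEGIN;
INSERT INTO t VALUES (1);   -- the trigger fires here, in this same transaction
ROLLBACK;                   -- undoes the trigger's insert as well
SELECT count(*) FROM audit; -- returns 0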
I was wondering at what point after issuing a SqliteConnection BeginTransaction() call the RESERVED lock engages and starts blocking writes.
Does the RESERVED lock correspond to the actual BeginTransaction call, or is it only taken after Commit is called and the transaction runs?
I ask because, in order to leverage my existing data access layer and not have to write custom transactions every time I need one to prevent a race condition, I want to make a call to BeginTransaction(), then call any combination of existing Select/Insert/Update wrappers to solve the problem at hand while having exclusive write access, and finally call Commit. To prevent the race conditions I'm trying to avoid, I would require that the RESERVED lock is active "immediately" upon calling BeginTransaction (i.e., sometime before it returns).
If more clarification or details are needed, please let me know and I'll be happy to provide them. Thank ya's kindly for your expertise.
The documentation says:
The first read operation against a database creates a SHARED lock and the first write operation creates a RESERVED lock.
When all changes fit into the page cache, the first actual write operation will happen during the COMMIT.
To force SQLite to take a lock when executing the BEGIN, start the transaction with BEGIN IMMEDIATE:
After a BEGIN IMMEDIATE, no other database connection will be able to write to the database or do a BEGIN IMMEDIATE or BEGIN EXCLUSIVE. Other processes can continue to read from the database, however.
If you want to prevent reads (which should not be necessary as all transactions are properly serialized in any case), use BEGIN EXCLUSIVE.
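In plain SQL the difference looks like this (table and column names are made up):

BEGIN IMMEDIATE;    -- takes the RESERVED lock right away, or fails with SQLITE_BUSY
SELECT balance FROM accounts WHERE id = 1;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

With a plain BEGIN (DEFERRED), the RESERVED lock would only be taken at the first write operation, or even at the COMMIT if all the changes fit into the page cache.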
I'm looking for something similar to an SQL transaction. I need the usual protections that transactions provide, but I don't want it to slow anyone else down.
Imagine client A connects to the DB and runs these commands:
BEGIN TRAN
SELECT (something)
(Wait a few seconds maybe.)
UPDATE (something)
COMMIT
In between the SELECT and the UPDATE, client B comes along and attempts to run a query that, under normal circumstances, would end up having to wait for A to COMMIT.
What I'd like is for client A to open its transaction in such a way that, should B come along and perform its query, client A will find its transaction immediately rolled back and its subsequent commands failing. Client B would only experience minimal delay.
(Note that the SELECT and UPDATE are simply illustrative commands.)
Update...
I've got a high-priority task (client B) that sometimes (once a month-ish) gets an SQL timeout error, and a low-priority task (client A) with a transaction which causes that timeout. I'd rather the low-priority task fail and be reattempted in the next cycle.
I ended up fixing this problem by eliminating the transactions entirely and replacing them with an informal set of flags. The queries were refactored to only do something if the right set of flags was raised, and I added something that cleaned up abandoned records that the rollback would have cleared in the past.
I fixed my transaction issues by eliminating transactions.
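Roughly, the shape of such a flag scheme (simplified sketch, all names invented; T-SQL):

-- claim the work with a flag instead of holding a transaction open
UPDATE jobs SET status = 'processing', claimed_at = GETDATE()
WHERE id = 42 AND status = 'pending';

-- ...do the work only if the claim affected exactly one row...

UPDATE jobs SET status = 'done' WHERE id = 42;

-- periodic cleanup of claims abandoned by a crashed worker
UPDATE jobs SET status = 'pending'
WHERE status = 'processing' AND claimed_at < DATEADD(MINUTE, -10, GETDATE());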
Using SNAPSHOT isolation level will prevent B from blocking. B will see data in the state they were before A issued BEGIN TRANSACTION. Unless B modifies data, they will never block each other.
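Assuming SQL Server (which the BEGIN TRAN syntax above suggests), that looks like this (database, table, and column names are placeholders):

-- one-time database setting
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- client B then opts in per session
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT * FROM accounts WHERE id = 1;  -- sees data as of the transaction start, without blocking on A's locks
COMMIT;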
While not a transaction at all, Optimistic Concurrency may be useful -- it is used by default in LINQ2SQL, etc.
The general idea is that the data is read, modifications are made independently, and then the data is written back with a "check" (loosely comparable to a Compare and Swap). If the check fails, it is up to the application to decide what to do (restart the process, proceed anyway, fail).
This naturally doesn't work for all scenarios and may not detect a number of interactions, such as new items added between the "read" and "write". Both the actual read and write can be in separate transactions with the appropriate isolation level; the separate transactions may allow additional transactions to be interleaved.
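A minimal sketch of the check using an explicit version column (names invented):

-- read, remembering the version
SELECT balance, version FROM accounts WHERE id = 1;   -- say this returns version 7

-- write back only if nobody changed the row in the meantime
UPDATE accounts
SET balance = 900, version = version + 1
WHERE id = 1 AND version = 7;
-- zero rows affected means the check failed; the application decides what happens next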
Of course, depending upon the exact problem and interactions... different isolation levels and/or finer grained locking may be sufficient.
Happy coding.
That is back to front.
You can't have later clients aborting earlier transactions: that's chaos.
You can use snapshot isolation so that client B has a consistent view and isn't blocked (mostly) by client A. See also Wikipedia for more general background.
Perhaps describe your problem more fully so we can offer suggestions for that...
One thing that I've seen used (but I'm afraid that I don't have any code handy for it) is having transaction A spawn another process which then monitors the transaction. If it sees any blocks caused by the transaction then it immediately issues a KILL to the spid.
If I can find the code for this then I'll add it here.
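Until then, a rough sketch of the idea for SQL Server (the session id 64 stands in for transaction A's spid; test something like this very carefully before relying on it):

-- run periodically from the monitoring process: is our spid blocking anyone?
SELECT session_id
FROM sys.dm_exec_requests
WHERE blocking_session_id = 64;

-- if any rows come back, terminate our own low-priority session
KILL 64;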
I have application code that inserts a record in a local database and a record in a remote database (via Oracle Database Link). When I commit this distributed transaction is it guaranteed that both the local and remote databases will both commit or both rollback, or is there a chance that the remote one could commit but the local commit would fail (or vice versa)?
I'd be astonished if Oracle does not use the equivalent of the Two-Phase Commit (2PC) protocol, which ensures that either both commit or both rollback.
With 2PC, there is a stage called the pre-commit phase where the master (coordinator) instance records its own decision and tells all participants to get ready to commit (and report their status: must fail, or can commit). The participants also get ready to commit and, if they can commit, await further instructions from the coordinator after telling it they are ready. When all the participants have responded, the coordinator records the final decision, sends that decision to the participants, and acts on it. If the master fails after recording the decision but before successfully sending it to the participants, the participants can be hung up in a state where they can neither commit nor roll back. There are ways to recover from that. If the coordinator stays down long enough (for example, it is taken out of service as a result of a catastrophic hardware failure), you can end up with problems; the participants typically end up doing a heuristic rollback (presumed rollback), but this requires astonishingly bad luck to cause any trouble.
There are alternatives to 2PC; the net result is the same - all commit or all rollback.
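From the application's point of view it looks like an ordinary transaction; Oracle coordinates the two-phase commit itself as soon as a database link is involved (the names below are placeholders):

INSERT INTO orders (id) VALUES (1);                 -- local database
INSERT INTO orders_copy@remote_db (id) VALUES (1);  -- remote database via the link
COMMIT;  -- Oracle runs 2PC across both; either both commit or both roll back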
I have some questions about transaction locks in an Oracle database. What I have found out so far is:
Cause: The time to wait on a lock in a distributed transaction has been exceeded. This time is specified in the initialization parameter DISTRIBUTED_LOCK_TIMEOUT.
Action: This situation is treated as a deadlock and the statement was rolled back. To set the time-out interval to a longer interval, adjust the initialization parameter DISTRIBUTED_LOCK_TIMEOUT, then shut down and restart the instance.
Some other things that I want to know in more detail:
It is mentioned that a lock in a 'distributed transaction' happened. So what kind of database operation can cause this? Updating a record? Selecting a record?
What does 'distributed' mean, anyway? I have seen this term used all over the place, but I can't seem to deduce what it means.
What can we do to reduce instances of such locks?
A distributed transaction means that you had a transaction that had two different participants. If you are using PL/SQL, that generally implies that there are multiple databases involved. But it may simply indicate that an application is using an external transaction coordinator in its interactions with the database. A J2EE application, for example, might want to create a distributed transaction that covers both issuing SQL statements against a database to move $100 from account A to account B as well as the application server action of creating a JMS message for this transaction that would eventually cause an email notification of the transfer to be sent. In this case, the application wants to ensure that the state of the middle tier matches the state of the back end.
Distributed transactions are not free. They involve potentially quite a bit of additional overhead because, at a minimum, you need to use the two-phase commit protocol to verify that all the components that are part of the distributed transaction are ready to commit and to verify that they all did commit. That involves sending a number of network packets, which can be a significant fraction of the time an OLTP transaction spends waiting. Distributed transactions also cause administrative issues, because you end up with cases where one participant's transaction fails after it indicated it was ready to commit, or a transaction coordinator fails while various participants have open transactions.
So the first question would be whether your application actually needs distributed transactions. Sometimes, developers find that they are accidentally requesting distributed transactions when they really aren't necessary. If you're not sure what a distributed transaction is, it's entirely possible that you don't really need them.
There is a guide here that will walk you through the steps to simulate an ORA-02049: timeout: distributed transaction waiting for lock, if you want a better understanding of one of its causes.
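The gist of such a simulation is one session holding a row lock while another session touches the same row through a database link (table and link names are invented):

-- session 1, on the remote database: lock a row and hold it
SELECT * FROM t WHERE id = 1 FOR UPDATE;

-- session 2, on the local database: update the same row over the link
UPDATE t@remote_db SET val = 2 WHERE id = 1;
-- after DISTRIBUTED_LOCK_TIMEOUT seconds (60 by default) this fails with
-- ORA-02049: timeout: distributed transaction waiting for lock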