Scenario:
We have a wcf workflow with a client that does NOT use transactionflow.
The workflow contains several sequential TransactedReceiveScopes (using content-based correlation).
The TransactedReceiveScopes contain custom db operations.
Observations:
When we run SQL profiler against the first call, we see all the custom db calls, and the SaveInstance call in the profile trace.
We've noticed that, even though the SendReply is at the very end of TransactedReceiveScope, sometimes the sendreply occurs a good 10 seconds before the transaction gets committed.
We tried changing the TimeToPersist and TimeToUnload to zero, but that had no effect. (The trace shows the SaveInstance happening immediately anyway, but rather the commit seems to be delayed).
Questions:
Are our observations correct?
At what point is the transaction committed? Is this like garbage collection - i.e. it commits some time later when it's not busy?
Is there any way to control the commit delay, or is the only way to do this to use transactionflow from the client (anc then it should all commit when the client commits, including the persist).
The TransactedReceiveScope commits the transaction when the body is completed but as all execution is done through the scheduler that could be some time later. It is not related to garbage collection and there is no real way to influence it other that to avoid a busy machine and a lot of other parallel activities that could also be in the execution queue.
Related
I want to implement an audit log using triggers which gets fired on created, changed and deleted data to store some values. Those triggers should be able to use user ids which made the changes and which are managed by the web application. I have some ideas on providing this data, but I don't seem to fully understand what the execution context of a trigger is. I've read through the PostgreSQL docs Overview of Trigger Behavior and others but my question doesn't seem to be answered.
What I want to know is the interaction between a client session with one running transaction and the trigger execution and the lifetime of both and how they depend on each other. From my understanding triggers are executed within the database independently from the client session which created the event which lead to trigger execution. Is that correct? That would mean triggers and their processing wouldn't impact performance of the client request and the client can close the session at any time. If both are independent, how would a trigger get notified about a client rolling back a transaction, which would logically mean that no data got changed at all? Or are triggers onyl executed after committing a transaction because they run independently?
Or are triggers executed async within the client session which created the events which lead to trigger execution? This would mean that if the client closes it's session for any reason, the trigger would abort, too. Their changes are directly bound to the clients transaction and can be rolled back, too.
I need to understand the behavior to know what I would like to do in another question.
Thanks for your input!
From my understanding triggers are executed within the database
independently from the client session which created the event which
lead to trigger execution. Is that correct? That would mean triggers
and their processing wouldn't impact performance of the client request
and the client can close the session at any time
No they totally depend on the client session, as part of the transaction which itself is tied to the session.
See this excerpt from CREATE TRIGGER (9.1):
They can be fired either at the end of the statement causing the
triggering event, or at the end of the containing transaction; in the
latter case they are said to be deferred
From your other question it appears you're using 8.4, which doesn't have deferred triggers, so it's even simpler. Triggers run always at the end of the statement (the triggering event), which means before the acknowledgment of execution is sent by the server to the client.
A COMMIT immediately following would be a new instruction, and could not be executed before the trigger is finished.
We are trying to implement retry logic to recover from transient errors in Azure environment.
We are using long-running sessions to keep track and commit the whole bunch of changes at the end of application transaction (which may spread over several web-requests). Along the way we need to get additional data from database. Our main problem is that we can't easily recover from db error because we can't "replay" all user actions.
So far we used straightforward recovery algorithm:
Try to perform operation in long-running session
In case of error, close the session, open a new one and merge entities into it
Retry the operation
It's very expensive approach in terms of time (merge is really long for big entity hierarchies). So we'd like to optimize things a little.
We'd like to perform query operations in separate session (to keep long running one untouched and safe) and on success, merge results back to the long-running session. Retry is relatively simple here - we just need to open new session and run query once more. However, with this approach we have an issue with initializing lazy properties/collections:
If we do this in separate session, we need to merge results back (a lot of entities) but merge could fail and break the long-running session
We tried different ways of "moving" original entity to different session, loading details and returning it back, but without success (evict, replicate, etc.)
There is known statement that session should be discarded in case of exception. However, the example shows write operation. Is it still true for read ones? I mean if I guarantee that no data is written back to the database, can I reuse the same session to run query again?
Do you have any other suggestions about retry logic with long-running sessions?
IMO there's no way to solve your issue. It's gonna take a lot of time to commit everything or you will have to do a lot of work to break it up into smaller sessions and handle every error that can occur while merging.
To answer your question about using the session after an exception: you cannot trust ANYTHING anymore inside this session, not even loaded entities.
Read this paragraph from Ayende's article about building a simple todo app with a recoveryplan in case of an exception in the session:
Then there is the problem of error handling. If you get an exception
(such as StaleObjectStateException, because of concurrency conflict),
your session and its loaded entities are toast, because with
NHibernate, an exception thrown from a session moves that session into
an undefined state. You can no longer use that session or any loaded
entities. If you have only a single global session, it means that you
probably need to restart the application, which is probably not a good
idea.
According to several resources, such as this,
A query that is executed within the context of a trigger is automatically wrapped in a transaction. If there are any distributed queries in the trigger code, the transaction is promoted to a distributed transaction automatically.
Simple question - is there a way to prevent this behavior? I'm looking for a way to explicitly prevent code in my trigger from running in the context of a transaction.
If you are trying to do something asynchronous so that the calling transaction doesn't have to wait, you may consider Service Broker, which is designed to do exactly that - go fire off some asynchronous task, and return control to the caller, regardless of transaction scope.
Another idea is to not have your trigger perform the work, but instead pop a work item onto a queue table, and have a background process running continuously to process the queue. This isn't necessarily easy to do if your work item operates on the set of data in inserted/deleted but without more context it certainly seems like a viable option.
I don't know of a way to prevent a trigger from being a part of the calling transaction - in fact that's kind of the whole point.
This is called "autonomous transaction", and the simplest way to implement is by creating a linked server to point to the original database.
See this MSDN blog for a possible solution.
I would like to know how you would run a stored procedure from a page and just "let it finish" even if the page is closed. It doesn't need to return any data.
A database-centric option would be:
Create a table that will contain a list (or queue) of long-running jobs to be performed.
Have the application add an entry to the queue if, when, and as desired. That's all it does; once logged and entered, no web session or state data need be maintained.
Have a SQL Agent job configured to check every 1, 2, 5, whatever minutes to see if there are any jobs to run.
If there are as-yet unstarted items, mark the most recent one as started, and start it.
When it's completed, mark it as completed, or just delete it
Check if there are any other items to run. If there are, repeat; if not, exit the job.
Depending on capacity, you could have several (differently named) copies of this job running, concurrently processing items from the list.
(I've used this method for very long-running methods. It's more an admin-type trick, but it may be appropriate for your situation.)
Prepare the command first, then queue it in the threadpool. Just make sure the thread does not depend on any HTTP Context or any other http intrinsic object. If your request finishes before the thread; the context might be gone.
See Asynchronous procedure execution. This is the only method that guarantees the execution even if the ASP process crashes. It also self tuning and can handle spikes of load, requests are queued up and processed as resources become available.
The gist of the solution is leveraging the SQL Server Activation concept, which allows you to run a stored procedure in a background thread in SQL Server without a client connection.
Solutions based on SqlClient asynch methods or on CLR thread pool are unreliable, the calls are lost as the ASP process is recycled, and besides they build up in-memory queues of requests that actually trigger a process recycle due to memory consumption.
Solutions based on tables and Agent jobs are better, as they are reliable, but they lack the self tuning of Activation based solutions.
I have some question around transaction lock in oracle database. What I have found out so far is that:
Cause: The time to wait on a lock in a distributed transaction has been exceeded. This time is specified in the initialization parameter DISTRIBUTED_LOCK_TIMEOUT.
Action: This situation is treated as a deadlock and the statement was rolled back. To set the time-out interval to a longer interval, adjust the initialization parameter DISTRIBUTED_LOCK_TIMEOUT, then shut down and restart the instance.
Some other things that I want to know in more details are things like:
It is mentioned that a lock in 'distributed transaction' happened. So what kind of database operation that can cause this ? Updating a record ? Selecting a record ?
What does 'Distributed' means anyway. I have seen this term coined all over the place, but I can't seem to deduce what it means.
What can we do to reduce instances of such lock ?
A distributed transaction means that you had a transaction that had two different participants. If you are using PL/SQL, that generally implies that there are multiple databases involved. But it may simply indicate that an application is using an external transaction coordinator in its interactions with the database. A J2EE application, for example, might want to create a distributed transaction that covers both issuing SQL statements against a database to move $100 from account A to account B as well as the application server action of creating a JMS message for this transaction that would eventually cause an email notification of the transfer to be sent. In this case, the application wants to ensure that the state of the middle tier matches the state of the back end.
Distributed transactions are not free. They involve potentially quite a bit of additional overhead because, at a minimum, you need to use the two-phase commit protocol to verify that all the components that are part of the distributed transaction are ready to commit and to verify that they all did commit. That involves sending a number of network packets which can be a significant fraction of the time an OLTP transaction is waiting. Distributed transactions also cause administrative issues because you end up with cases where one participant's transaction fails after it indicated it was ready to commit or a transaction coordinator failing while various participants have open transactions.
So the first question would be whether your application actually needs distributed transactions. Sometimes, developers find that they are accidentally requesting distributed transactions when they really aren't necessary. If you're not sure what a distributed transaction is, it's entirely possible that you don't really need them.
There is a guide here that will walk you through the steps to simulate an ORA-02049: timeout: distributed transaction waiting for lock if you want a better understanding of one of its causes: