I'm trying to figure out what is causing deadlocks in my Symfony 2 application. I'm running a cronjob that does batch-updates on a fairly large dataset and one part of it causes this error:
Doctrine\DBAL\DBALException: An exception occurred while executing
'UPDATE SpotEvent SET ts = ?, current = ? WHERE id = ?' with params
["2015-12-28 00:35:27", 1, 39316]: SQLSTATE[40P01]: Deadlock
detected: 7 ERROR: deadlock detected DETAIL: Process 32030 waits for
ShareLock on transaction 2130787; blocked by process 32029. Process
32029 waits for ShareLock on transaction 2130786; blocked by process
32030. HINT: See server log for query details. CONTEXT: while updating tuple (105,68) in relation "spotevent" (uncaught exception)
at
/home/maf/symfony/vendor/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php
line 91 while running console command
The code causing it is basically this:
check event
if (already in database) {
update timestamp
} else {
create new
}
From what I see in the error, the first branch causes the deadlock, but from what I read about deadlocks, the second should be more likely. In any case I don't understand why I have a deadlock at all.
I should say I am running this job in 6 parallel processes. However, there is no overlap between them (i.e. job one is checking from 1-200, job 2 from 201 to 400, etc.)
I'm using PostgreSQL as the database backend. My "check event" step is done using DQL, everything else is pure ORM.
I've got a Talend job running with a couple of dataflows running in parallel against a Snowflake database. An update statement against Table A is causing an update on Table B to fail with the following error:
Transaction 'uuid-of-transaction', id 'a-very-long-integer-id', is being committed, SQL execution canceled.
Call END_OPERATION(999,'String1','String2','String3','String4','Success','0')
UPDATE TableB SET BATCH_KEY = 1234, LOAD_DT = current_timestamp::timestamp_ntz, KEY_HASH = MD5(TO_VARCHAR(ARRAY_CONSTRUCT(col1))), ROW_HASH = MD5(TO_VARCHAR(ARRAY_CONSTRUCT(col2, col3))) WHERE BATCH_KEY = -1 OR BATCH_KEY IS NULL;
The code for END_OPERATION is here:
var cmd =
"CALL END_OPERATION(:1,:2,:3,:4,:5,:6,null);";
try {
snowflake.execute (
{sqlText: cmd,binds: [BATCH_KEY,ENTITY,LAYER,SRC,OPERATION,OPERATION_STATUS].map(function(param){return param === undefined ? null : param})},
);
return "Succeeded.";
}
catch (err) {
return "Failed: " + err;
}
var cmd =
"UPDATE TableA SET OPERATION_STATUS=:6,END_DT=current_timestamp,ROW_COUNT=IFNULL(:7,ROW_COUNT) WHERE BATCH_KEY=:1 AND ENTITY_NAME=:2 AND LAYER_NAME=:3 AND SRC=:4 AND OPERATION_NAME=:5";
try {
snowflake.execute (
{sqlText: cmd,binds: [BATCH_KEY,ENTITY,LAYER,SRC,OPERATION,OPERATION_STATUS,ROW_COUNT].map(function(param){return param === undefined ? null : param})},
);
return "Succeeded.";
}
catch (err) {
return "Failed: " + err;
}
I'm failing to understand why the UPDATE statement against TableB is getting killed. It's getting killed nearly immediately.
Here we need to review the flow of all SQL statements coming from the Talend job within the same session in which the failing SQL command is run as well as all the statements coming from the other parallel job.
From the Query History we can get the SessionID of the session. In the History section of the Snowflake UI we can search on that SessionID, which lists all the commands run through this particular session.
We can then sort on the start_date column to review the commands in chronological order and observe the sequence of SQL statements.
You are right that an update on TableA should not affect an update on TableB. However, after reviewing all the statements of both sessions (the Talend job is running a couple of dataflows in parallel), we may find a SQL statement in one session that took a lock on TableB before the UPDATE against it was submitted from the other session.
Another thing to review is how the workflow manages transactions. Within the same list of SQL queries for that session, check for any statement that sets the Autocommit parameter at the session level. If Autocommit is set to FALSE at the start of the session, the session will not release any of its table locks until an explicit commit is submitted.
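The effect of an uncommitted transaction can be reproduced locally. The sketch below is only an analogy: it uses Python's built-in sqlite3 module rather than Snowflake, SQLite locks the whole database file rather than a single table, and the table and function names are made up for illustration. It shows a second session failing to update while the first one holds an open transaction, and succeeding once the first one commits.

```python
import os
import sqlite3
import tempfile


def demo_lock_held_until_commit():
    """Return (blocked_before_commit, succeeded_after_commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # isolation_level=None: the sqlite3 module issues no implicit BEGIN,
        # so transactions are controlled explicitly below.
        session_a = sqlite3.connect(path, isolation_level=None)
        # Short busy timeout so the blocked statement fails quickly.
        session_b = sqlite3.connect(path, isolation_level=None, timeout=0.1)

        session_a.execute(
            "CREATE TABLE table_b (id INTEGER PRIMARY KEY, batch_key INTEGER)"
        )
        session_a.execute("INSERT INTO table_b (id, batch_key) VALUES (1, -1)")

        # Session A opens a transaction and does not commit yet.
        session_a.execute("BEGIN IMMEDIATE")
        session_a.execute("UPDATE table_b SET batch_key = 1 WHERE id = 1")

        # Session B cannot write while A's transaction is still open.
        try:
            session_b.execute("UPDATE table_b SET batch_key = 2 WHERE id = 1")
            blocked = False
        except sqlite3.OperationalError:
            blocked = True

        # After the explicit commit the lock is released.
        session_a.execute("COMMIT")
        session_b.execute("UPDATE table_b SET batch_key = 2 WHERE id = 1")
        row = session_b.execute(
            "SELECT batch_key FROM table_b WHERE id = 1"
        ).fetchone()

        session_a.close()
        session_b.close()
        return blocked, row[0] == 2
```

Calling demo_lock_held_until_commit() should report that the second session was blocked before the commit and succeeded after it. In Snowflake the equivalent check is whether some statement in the other session left a transaction open on TableB.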
Since the situation sounds a bit unusual and complex, we may have to dig a little deeper into the execution logs of both queries, and for that we may have to contact Snowflake support.
I have implemented a loop job in Pentaho Kettle. This job has a "Simple evaluation" entry that checks, based on a condition, whether it should rerun the previous transformation. The variable used in the condition is set by a "Set variable" step in the previous transformation. The problem is that the job runs perfectly the first time, but after the server is restarted, the job gets stuck and throws this error:
Unexpected error occurred while launching entry.
This error occurs for both the transformation and the "Simple evaluation" entry, and then the job is stuck in the loop forever.
So I have two questions here:
Why does this error occur?
If an error occurs, the transformation should stop, but instead it forwards the flow to "Simple evaluation" and vice versa.
Below is an image of the job:
crosspost: https://orchard.codeplex.com/discussions/473454
I want to start by saying I'm currently migrating from Orchard CMS 1.6 to 1.7.2. So it used to work in 1.6 but I'm now having issues with 1.7.2.
Two of my Content Types have issues when creating items: they never finish saving, and when I check the logs I get this:
Orchard.Alias.Implementation.Updater.AliasHolderUpdater - Exception during Alias refresh
NHibernate.Exceptions.GenericADOException: could not execute query
[ select aliasrecor0_.Id as Id1829_, aliasrecor0_.Path as Path1829_, aliasrecor0_.RouteValues as RouteVal3_1829_, aliasrecor0_.Source as Source1829_, aliasrecor0_.Action_id as Action5_1829_ from Orchard_Alias_AliasRecord aliasrecor0_ where aliasrecor0_.Id>#p0 order by aliasrecor0_.Id asc ]
Name:p1 - Value:48
[SQL: select aliasrecor0_.Id as Id1829_, aliasrecor0_.Path as Path1829_, aliasrecor0_.RouteValues as RouteVal3_1829_, aliasrecor0_.Source as Source1829_, aliasrecor0_.Action_id as Action5_1829_ from Orchard_Alias_AliasRecord aliasrecor0_ where aliasrecor0_.Id>#p0 order by aliasrecor0_.Id asc] ---> System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out
When I stop it and view the site (anywhere really), it's entirely wrecked with this error:
Exception Details: System.ComponentModel.Win32Exception: The wait operation timed out
[Win32Exception (0x80004005): The wait operation timed out]
[SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.]
Line 162: return criteria
Line 163: .List<ContentItemVersionRecord>()
Line 164: .Select(x => ContentManager.Get(x.ContentItemRecord.Id, _versionOptions != null && _versionOptions.IsDraftRequired ? _versionOptions : VersionOptions.VersionRecord(x.Id)))
Source File: d:\Projects\Office Ignite\Main-1.7\src\Orchard\ContentManagement\DefaultContentQuery.cs Line: 162
I don't know why this is isolated to those two CTs. They don't have parts with custom tables or anything.
Any piece of information would be highly appreciated. Thanks!
I have the same error, but it seems the problem is not directly related to my code.
I have found two solutions for now:
1.) Taxonomy corruption problem https://orchard.codeplex.com/workitem/20411
2.) Statistics are dirty and the lock that is the default in SELECT statements is heavily used https://serverfault.com/questions/419997/the-wait-operation-timed-out-when-running-sql-server-in-hyper-v
I'm using NServiceBus to handle some calculation messages. I have a new requirement to handle calculation errors by writing them to the same database. I'm using NHibernate as my DAL, which auto-enlists in the NServiceBus transaction and provides rollback in case of exceptions, which is working really well. However, if I write this particular error to the database, it is also rolled back, which is a problem.
I knew this would be a problem, but I thought I could just wrap the call in a new transaction with TransactionScopeOption = Suppress. However, the error data is still rolled back. I believe that's because it was using the existing session, which has already enlisted in the NServiceBus transaction.
Next I tried opening a new session from the existing SessionFactory within the suppression transaction scope. However, the first call to the database to retrieve or save data using this new session blocks and then times out.
InnerException: System.Data.SqlClient.SqlException
Message=Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Finally I tried creating a new SessionFactory and using it to open a new session within the suppression transaction scope. However, again it blocks and times out.
I feel like I'm missing something obvious here, and would greatly appreciate any suggestions on this probably common task.
As Adam suggests in the comments, in most cases it is preferred to let the entire message fail processing, giving the built-in Retry mechanism a chance to get it right, and eventually going to the error queue. Then another process can monitor the error queue and do any required notification, including logging to a database.
However, there are some use cases where the entire message is not a failure, i.e. on the whole, it "succeeds" (whatever the business-dependent definition of that is) but there is some small part that is in error. For example, a financial calculation in which the processing "succeeds" but some human element of the data is "in error". In this case I would suggest catching that exception and sending a new message which, when processed by another endpoint, will log the information to your database.
I could see another case where you want the entire message to fail, but you want the fact that it was attempted noted somehow. This may be closest to what you are describing. In this case, create a new TransactionScope with TransactionScopeOption = Suppress, and then (again) send a new message inside that scope. That message will be sent whether or not your full message transaction rolls back.
You are correct that your transaction is rolling back because the NHibernate session is opened while the transaction is in force. Trying to open a new session inside the suppressed transaction can cause a problem with locking. That's why, most of the time, sending a new message asynchronously is part of the solution in these cases, but how you do it is dependent upon your specific business requirements.
I know I'm late to the party, but as an alternative suggestion, you could simply raise another separate log message, which NSB handles independently, for example:
public void Handle(DebitAccountMessage message)
{
    var account = this.dbcontext.GetById(message.Id);
    if (account.Balance <= 0)
    {
        // log request - new handler
        this.Bus.Send(new DebitAccountLogMessage
        {
            originalMessage = message,
            account = account,
            timeStamp = DateTime.UtcNow
        });
        // throw error - NSB will handle
        throw new DebitException("Not enough funds");
    }
}
public void Handle(DebitAccountLogMessage message)
{
    var messageString = message.originalMessage.Dump();
    var accountString = message.account.Dump(DumpOptions.SuppressSecurityTokens);
    this.Logger.Log(message.UniqueId, string.Format("{0}, {1}", messageString, accountString));
}
I've been working on this for about a day and a half now, and searched numerous blogs and help articles on the Web. I found several questions on SO related to this error, but I didn't think they quite applied to my situation (or in some cases, unfortunately, I couldn't understand them well enough to implement :P). I'm not sure I can describe this well enough for help... but here goes:
We have a .NET app to track our resources. There's an export function to copy a resource to the time tracking system and the billing system; this accesses a stored procedure that links to the time and billing databases.
I recently moved the billing system database to a new server (original server: Server 2003 SP2, SQL 2005; new server: Server 2008 R2, SQL 2008 R2). I have a Linked Server set up that points to the 2008 databases. I updated the stored procedure to point to the 2008 server, and then I got an error about MSDTC and RPC (http://www.safnet.com/writing/tech/archives/2007/06/server_myserver.html). I enabled 'rpc/rpc out' on the Linked Server and set MSDTC to allow Network Access (something like this: http://www.sqlwebpedia.com/content/msdtc-troubleshooting).
Now I'm getting the above, when I try to run the export function: "This SqlTransaction has completed; it is no longer usable." What seems odd to me is that when I just run the stored procedure (from SSMS), it says it completes successfully.
Has anyone seen this before? Have I missed something in the configuration? I keep going over the same pages, and the only thing I found was that I didn't reboot after making the MSDTC changes (mentioned in here: http://social.msdn.microsoft.com/forums/en-US/adodotnetdataproviders/thread/7172223f-acbe-4472-8cdf-feec80fd2e64/).
I can post part or all of the stored procedure, if it would help... please let me know.
I believe this error message is due to a "zombie transaction".
Look for possible areas where the transaction is being committed twice (or rolled back twice, or rolled back and committed, etc.). Does the .NET code commit the transaction after the SP has already committed it? Does the .NET code roll it back on encountering an error, then attempt to roll it back again in a catch (or finally) clause?
It's possible an error condition was never being hit on the old server, and thus the faulty "double rollback" code was never hit. Maybe now you have a situation where there is some configuration error on the new server, and now the faulty code is getting hit via exception handling.
Can you debug into the error code? Do you have a stack trace?
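The "double rollback" case can be modeled in a few lines. This is a toy Python model, not ADO.NET: the Transaction class, TransactionCompletedError, and faulty_export are hypothetical names that only mimic the rule that a transaction can be completed once. The success path works, so the bug stays hidden until the error path is hit for the first time.

```python
class TransactionCompletedError(Exception):
    """Raised when a finished transaction is used again (hypothetical name)."""


class Transaction:
    """Toy stand-in for a database transaction that can be completed once."""

    def __init__(self):
        self.completed = False

    def _complete(self, action):
        if self.completed:
            raise TransactionCompletedError(
                f"cannot {action}: transaction has completed"
            )
        self.completed = True

    def commit(self):
        self._complete("commit")

    def rollback(self):
        self._complete("rollback")


def faulty_export(transaction, work):
    """Roll back in the handler and again in finally (the hidden bug)."""
    failed = False
    try:
        work()
        transaction.commit()
    except ValueError:
        failed = True
        transaction.rollback()      # first rollback succeeds
    finally:
        if failed:
            transaction.rollback()  # second rollback raises


def failing_work():
    raise ValueError("configuration error on the new server")
```

With work that succeeds, faulty_export completes normally. With failing_work, the second rollback raises TransactionCompletedError, which plays the role of "This SqlTransaction has completed; it is no longer usable."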
I had this recently after refactoring in a new connection manager. A new routine accepted a transaction so it could be run as part of a batch, problem was with a using block:
public IEnumerable<T> Query<T>(IDbTransaction transaction, string command, dynamic param = null)
{
    using (transaction.Connection)
    {
        using (transaction)
        {
            return transaction.Connection.Query<T>(command, new DynamicParameters(param), transaction, commandType: CommandType.StoredProcedure);
        }
    }
}
It looks as though the outer using was closing the underlying connection, so any attempt to commit or roll back the transaction threw the message "This SqlTransaction has completed; it is no longer usable."
I removed the usings, added a covering test, and the problem went away.
public IEnumerable<T> Query<T>(IDbTransaction transaction, string command, dynamic param = null)
{
    return transaction.Connection.Query<T>(command, new DynamicParameters(param), transaction, commandType: CommandType.StoredProcedure);
}
Check for anything that might be closing the connection while inside the context of a transaction.
Had the exact same problem and just could not find the right solution.
Hope this helps somebody.
I have a .NET Core 3.1 WebApi with EF Core. Upon receiving multiple calls at the same time, the application was trying to add and save changes to the database at the same time.
In my case the problem was that the table the data would be saved in did not have a primary key set.
Somehow, when the migration was run from the application, EF Core missed that the ID in the model was supposed to be a primary key.
I found the problem by opening the SQL Profiler and seeing that all transactions were successfully submitted to the database (from the application) but only one new row was created. The profiler also showed that some type of deadlock was happening, but I couldn't see much more in the trace logs of the profiler.
On further inspection I noticed that the primary key identifier was missing on the column "Id".
The exceptions I got from my application were:
This SqlTransaction has completed; it is no longer usable.
and/or
An exception has been raised that is likely due to a transient
failure. Consider enabling transient error resiliency by adding
'EnableRetryOnFailure()' to the 'UseSqlServer' call.
I have the same problem. This error occurs because of connection pooling. When two or more users access the system, the connection pool reuses a connection and its transaction too. If the first user executes a commit or rollback, the transaction is no longer usable.
I recently ran across a similar situation. To debug in any VS IDE version, open Exceptions from the Debug menu (Ctrl + D, E), check all the checkboxes in the "Thrown" column, and run the application in debug mode. I realized that one of the tables had not been imported properly into the new database, so an internal SqlException was killing the connection, which resulted in this error.
The gist of the story: if previously working code returns this error on a new database, the cause could be a missing piece of database schema, which the debugging tip above will reveal.
Hope It Helps,
HydTechie
Also check for any long running processes executed from your .NET app against the DB. For example you may be calling a stored procedure or query which does not have enough time to finish which can show in your logs as:
Execution Timeout Expired. The timeout period elapsed prior to
completion of the operation or the server is not responding.
This SqlTransaction has completed; it is no longer usable.
Check the command timeout settings
Try to run a trace (profiler) and see what is happening on the DB side...
In my case the problem was that one of the queries included in the transaction was raising an exception, and even though the exception was "gracefully" handled, it still managed to roll back the entire transaction.
My pseudo-code was like:
var transaction = connection.BeginTransaction();
for(all the lines in a file)
{
    try {
        InsertLineInTable(); // INSERT statement might fail and throw an exception
    }
    catch {
        // notify the user about the error on line x and continue
    }
}
// Commit and Rollback will fail if one of the queries
// in InsertLineInTable threw an exception
if(CheckTableForErrors())
{
    transaction.Commit();
}
else
{
    transaction.Rollback();
}
Here is a way to detect a zombie transaction:
SqlTransaction trans = connection.BeginTransaction();
// some db calls here
if (trans.Connection != null) // Detecting zombie transaction
{
    trans.Commit();
}
Decompiling the SqlTransaction class, you will see the following
public SqlConnection Connection
{
    get
    {
        if (this.IsZombied)
            return (SqlConnection) null;
        return this._connection;
    }
}
I noticed that if the connection is closed, the transaction becomes a zombie and can no longer be committed.
For my case, it is because I have the Commit() inside a finally block, while the connection was in the try block. This arrangement is causing the connection to be disposed and garbage collected. The solution was to put Commit inside the try block instead.
For what it's worth, I've run into this on what was previously working code. I had added SELECT statements in a trigger for debug testing and forgot to remove them. Entity Framework / MVC doesn't play nice when other stuff is output to the "grid". Make sure to check for any rogue queries and remove them.
In my case, I have some code that needs to run after committing the transaction, in the same try-catch block. Part of that code threw an error, so the try block handed the error over to its catch block, which contains the transaction rollback.
That produces a similar error. For example, look at the code structure below:
SqlTransaction trans = null;
try
{
    trans = Con.BeginTransaction();
    // your code
    trans.Commit();
    // your code that throws an error
}
catch (Exception ex)
{
    trans.Rollback(); // transaction roll back
    // error message
}
finally
{
    // connection close
}
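One way to avoid this structure, sketched here in Python with the built-in sqlite3 module rather than SqlTransaction, is to keep only the transactional work inside the try block whose handler rolls back, and to run the follow-up code after it. The function, table, and callback names are made up for illustration.

```python
import sqlite3


def save_then_follow_up(conn, value, follow_up):
    """Insert a row, commit, then run follow_up outside the rollback scope."""
    try:
        conn.execute("INSERT INTO items (value) VALUES (?)", (value,))
        conn.commit()
    except sqlite3.Error:
        # Only database failures before or during the commit reach this handler.
        conn.rollback()
        raise

    # The transaction is finished here, so a failure below can no longer
    # trigger a rollback of an already committed transaction.
    try:
        follow_up(value)
    except Exception as exc:
        return f"saved, follow-up failed: {exc}"
    return "saved"


def count_items(conn):
    """Return the number of rows in the items table."""
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

With this split, a failing follow-up step is reported separately and the committed row stays in the table.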
Hope it will help someone :)