Hangfire - Rescheduling Scheduled jobs

To reschedule a scheduled job, I currently delete the previous job and schedule a new one inside a transaction, so that if scheduling fails the delete is rolled back:
using (var transaction = new TransactionScope(TransactionScopeOption.Required))
{
    BackgroundJob.Delete(jobId);
    BackgroundJob.Schedule(...);
    // Without Complete(), disposing the scope rolls back both calls.
    transaction.Complete();
}
This means storing the original arguments somewhere so that I can reschedule with them.
I was wondering whether there are any ramifications to doing the following instead:
new BackgroundJobClient().ChangeState(jobId, new ScheduledState(timespan), ScheduledState.StateName);
This way there is no need to keep the original job arguments around, and the change is also transactional.
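One detail that may matter for the ChangeState approach: the third argument is the expected current state, and the call returns a bool saying whether the change was applied. A minimal sketch of wrapping this follows; the JobRescheduler class and TryReschedule method are hypothetical names introduced for illustration, and the code assumes Hangfire storage is already configured.

```csharp
using System;
using Hangfire;
using Hangfire.States;

// Hypothetical helper, not part of Hangfire.
public static class JobRescheduler
{
    // Moves a job that is still in the Scheduled state to a new delay.
    // Returns false when the job was no longer in the expected state
    // (for example, it had already been enqueued or deleted), in which
    // case no state change is applied.
    public static bool TryReschedule(IBackgroundJobClient client, string jobId, TimeSpan delay)
    {
        return client.ChangeState(
            jobId,
            new ScheduledState(delay),
            ScheduledState.StateName); // expected current state
    }
}
```

A caller could then write JobRescheduler.TryReschedule(new BackgroundJobClient(), jobId, TimeSpan.FromMinutes(30)) and treat a false result as "the job has already moved on", rather than assuming the reschedule always succeeds.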

Related

Triggering a scheduled job from multiple hangfire servers

We trigger a scheduled job at a configured interval, and it fires as expected. This works in a single Hangfire server environment.
Some of our environments will have more than one server. In that scenario, every server will trigger the job at that interval. Can we restrict the job so that it is triggered from only one server?
string cronExpressionCleanupJob = "0 0 0/{2} ? * *";
RecurringJob.AddOrUpdate<CleanUpJobTriggerJob>(nameof(CleanUpJobTriggerJob),
job => job.ExecuteJob(null, null), cronExpressionCleanupJob, TimeZoneInfo.Local);
var hangFireJobId = BackgroundJob.Enqueue<CleanUpJobTriggerJob>(x => x.ExecuteJob(null, null));
According to the documentation, you should not worry as long as you specify the same identifier for your recurring job.
The call to AddOrUpdate method will create a new recurring job or
update existing job with the same identifier.

Locking database rows

I have a table in my database with defined jobs. One of a job's attributes is status, which can be one of [waiting, in_progress, done]. To process jobs I have defined a master-worker relation between two servers, which work in the following way:
The master looks for the first record in the database with status 'waiting' and triggers a worker server to process the job.
The triggered worker sets the job's status to 'in_progress' in one transaction and starts executing the job (in the meantime the master looks for the next job with status 'waiting' and triggers another worker).
When the job is done, the worker sets the job's status to 'done'.
This is the ideal scenario; however, a worker might die during job execution. In that case the job needs to be restarted, but the master has no way of verifying whether a job was done other than checking its status in the database ('done'). Therefore, if a worker dies, the job still has status 'in_progress' in the database, and the master has no idea that the worker died and that the job needs to be restarted. (We can't get any feedback from the worker: we cannot ping it, and we cannot find out which job it is currently working on.)
My idea to solve this problem would be:
After the worker has changed the job's status to 'in_progress' (transaction committed), it would open a new transaction with a lock on that particular job.
The master, while looking for jobs to start, would look for both 'waiting' and 'in_progress' jobs that are not locked.
If the worker dies, its transaction would break, releasing the lock on the job record in the database, and the master would reprocess the job.
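The master's "find a row that is not locked" step could be sketched as below. This is only an illustration under assumptions: it assumes an Oracle database (where FOR UPDATE SKIP LOCKED is available), a hypothetical table jobs(id, status), and hypothetical names JobClaimer and TryClaimNext. It uses the provider-neutral System.Data.Common types, so any ADO.NET Oracle driver would have to supply the connection.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper; table and column names are assumptions.
public static class JobClaimer
{
    private const string ClaimSql =
        "SELECT id FROM jobs " +
        "WHERE status IN ('waiting', 'in_progress') " +
        "ORDER BY id " +
        "FOR UPDATE SKIP LOCKED";

    // Returns the id of the first matching row that no other session has
    // locked, or null when every candidate row is locked or none match.
    // The returned row stays locked until the caller commits or rolls back
    // the transaction, or until the session dies.
    public static long? TryClaimNext(DbConnection connection, DbTransaction transaction)
    {
        using (DbCommand command = connection.CreateCommand())
        {
            command.Transaction = transaction;
            command.CommandText = ClaimSql;
            using (DbDataReader reader = command.ExecuteReader())
            {
                // Only the first fetched row is used; rows locked by live
                // workers are skipped by the database, not by this code.
                return reader.Read() ? Convert.ToInt64(reader.GetValue(0)) : (long?)null;
            }
        }
    }
}
```

With this shape, a job whose worker still holds its row lock is never returned, while a job left 'in_progress' by a dead worker becomes visible again once the database releases that worker's lock. A driver that prefetches rows may lock more than the one row returned, so the fetch size is worth checking when testing this.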
Now I'm looking for a way to verify whether this would indeed work, possibly with a SQL script in SQL Developer (two instances). I would like it to work in the following way:
Instance 1:
open transaction and create a row lock on row_1
sleep for 5 min
release lock
Instance 2:
Open transaction and look for row matching criteria (row_1 and row_2 match criteria)
return selected row
Then I would kill instance 1 to simulate worker death and run instance 2 again.
Please advise whether my approach is correct.
P.S.
Also, could you point me to a good explanation of how I can create the script for instance 1?

Does Oracle Database start a new job (from the Scheduler) before the previous run of the same job has finished?

What happens if Oracle Database is due to start a job (from the Scheduler) before the previous run of the same job has finished? Does Oracle add it to a stack, or does it stop in the end?
Oracle is smart enough to know not to start a new job instance before the previous job is finished.
From the Oracle docs:
http://docs.oracle.com/cd/B19306_01/server.102/b14231/scheduse.htm
Setting the Repeat Interval
...
Immediately after a job is started, the repeat_interval is evaluated to determine the next scheduled execution time of the job. It is possible that the next scheduled execution time arrives while the job is still running. A new instance of the job, however, will not be started until the current one completes.

Handle NHibernate Transaction Errors

Our application (which uses NHibernate and ASP.NET MVC), when put under stress tests throws a lot of NHibernate transaction errors. The major types are:
Transaction not connected, or was disconnected
Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect)
Transaction (Process ID 177) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Can someone help me in identifying the reason for Exception 1?
I know I have to handle the other exceptions in my code. Can someone point me to resources which can help me handle these errors in an efficient manner?
Q. How do we manage Sessions and Transactions?
A. We are using Autofac. For every server request, we create a new request container which has the session in the container lifetime scope. On activating the session we begin the transaction. When the request completes, we commit the transaction. In some cases, the transaction can be huge. To simplify, every server request is contained in a transaction.
Have a look at this thread:
http://n2cms.codeplex.com/Thread/View.aspx?ThreadId=85016
Basically, it gives the following as a possible cause of this exception:
2010-02-17 21:01:41,204 1 WARN
NHibernate.Util.ADOExceptionReporter -
System.Data.SqlClient.SqlException:
The transaction log for database
'databasename' is full. To find out
why space in the log cannot be reused,
see the log_reuse_wait_desc column in
sys.databases
As the transaction log's size is proportional to the amount of work done during the transaction, perhaps you ought to look into drawing your transactional boundaries around the command handlers' handling of commands on the write side. You would then, with a session #X, load the state you wish to mutate, mutate it and commit it, all as one unit of work in #X.
With regard to the read side of things, you might then have another ISession #Y that reads data; this ISession could be used to batch reads within e.g. RepeatableRead or something similar with the Futures feature, and could simply be reading from a cache (albeit that is a crutch). Doing it this way might help you recover from "errors" that aren't really errors: livelocks, deadlocks and victim transactions.
The problem with using a transaction per request is that your ISession acquires a lot of bookkeeping data while you are working, all of which is part of the transaction. Hence the database marks the data (rows, columns, tables, etc.) as partaking in the transaction, causing the wait-graph to span 'entities' (in the database sense, not the DDD sense) that are not actually part of the transactional boundary of the command your application took.
For the record (for other people googling this), Fabio had a post about dealing with exceptions from the data layer. Quoting some of his code:
public class MsSqlExceptionConverterExample : ISQLExceptionConverter
{
    public Exception Convert(AdoExceptionContextInfo exInfo)
    {
        var sqle = ADOExceptionHelper.ExtractDbException(exInfo.SqlException) as SqlException;
        if (sqle != null)
        {
            switch (sqle.Number)
            {
                case 547:
                    return new ConstraintViolationException(exInfo.Message,
                        sqle.InnerException, exInfo.Sql, null);
                case 208:
                    return new SQLGrammarException(exInfo.Message,
                        sqle.InnerException, exInfo.Sql);
                case 3960:
                    return new StaleObjectStateException(exInfo.EntityName, exInfo.EntityId);
            }
        }
        return SQLStateConverter.HandledNonSpecificException(exInfo.SqlException,
            exInfo.Message, exInfo.Sql);
    }
}
547 is the exception number for constraint conflict.
208 is the exception number for an invalid object name in the SQL.
3960 is the exception number for Snapshot isolation transaction aborted due to update conflict.
So if you are running into concurrency issues like the ones you describe, remember that they will invalidate your ISession and that you'd have to handle them as shown above.
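Because a failed ISession cannot be reused, the "Rerun the transaction" advice in the deadlock message implies repeating the whole unit of work with a fresh session. A minimal, library-independent sketch of such a retry loop follows. TransientRetry, Execute and the isTransient predicate are hypothetical names introduced for illustration; deciding which exceptions count as transient (for example, a SqlException whose Number is 1205, SQL Server's deadlock-victim error) is left to the caller.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of NHibernate.
public static class TransientRetry
{
    // Runs unitOfWork up to maxAttempts times. The delegate is expected to
    // open its own session and transaction on every call, so that a session
    // invalidated by a failed attempt is never reused.
    public static T Execute<T>(
        Func<T> unitOfWork,
        Func<Exception, bool> isTransient,
        int maxAttempts = 3,
        int baseDelayMs = 100)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return unitOfWork();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Linear back-off before retrying with a brand-new unit of work.
                Thread.Sleep(baseDelayMs * attempt);
            }
        }
    }
}
```

On the last attempt the exception filter no longer matches, so the original exception propagates to the caller unchanged. Non-transient errors are never retried.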
Part of what you might be looking for is CQRS, where you have separate read and write-sides. This might help: http://abdullin.com/cqrs/, http://cqrsinfo.com.
So, to summarize: your problems might be related to the way you handle your transactions. Also, try running select log_reuse_wait_desc from sys.databases where name='MyDBName' and see what it gives you.
This thread has an explanation:
http://groups.google.com/group/nhusers/browse_thread/thread/7f5fb68a00829d13
In short, the database probably rolls back the transaction by itself due to some error, so that when you try to rollback the transaction later it is already rolled back and in a zombie state. This tends to hide the actual reason for the rollback since all you see is a TransactionException instead of the exception that actually triggered the rollback in the first place.
I don't think there is much you can do about it beyond logging it and trying to figure out what is causing the underlying error.
I know this post was a while back and I assume you fixed it, but it seems like you have thread-sharing issues with the NHibernate ISession, which is not thread-safe. Basically, one thread is starting a transaction and another is attempting to close it, causing all sorts of chaos.

Asynchronous Triggers in SQL Server 2005/2008

I have triggers that manipulate and insert a lot of data into a Change tracking table for audit purposes on every insert, update and delete.
This trigger does its job very well, by using it we are able to log the desired oldvalues/newvalues as per the business requirements for every transaction.
However, in some cases where the source table has a lot of columns, it can take up to 30 seconds for the transaction to complete, which is unacceptable.
Is there a way to make the trigger run asynchronously? Any examples?
You can't make the trigger run asynchronously, but you could have the trigger synchronously send a message to a SQL Service Broker queue. The queue can then be processed asynchronously by a stored procedure.
These articles show how to use Service Broker for async auditing and should be useful:
Centralized Asynchronous Auditing with Service Broker
Service Broker goodies: Cross Server Many to One (One to Many) scenario and How to troubleshoot it
SQL Server 2014 introduced a very interesting feature called Delayed Durability. If you can tolerate losing a few rows in case of a catastrophic event, like a server crash, you could really boost your performance in scenarios like yours.
Delayed transaction durability is accomplished using asynchronous log
writes to disk. Transaction log records are kept in a buffer and
written to disk when the buffer fills or a buffer flushing event takes
place. Delayed transaction durability reduces both latency and
contention within the system
The database containing the table must first be altered to allow delayed durability.
ALTER DATABASE dbname SET DELAYED_DURABILITY = ALLOWED
Then you could control the durability on a per-transaction basis.
begin tran
insert into ChangeTrackingTable select * from inserted
commit with(DELAYED_DURABILITY=ON)
The transaction will be committed as fully durable if it is cross-database, so this will only work if your audit table is located in the same database as the trigger.
There is also a possibility to alter the database as forced instead of allowed. This causes all transactions in the database to become delayed durable.
ALTER DATABASE dbname SET DELAYED_DURABILITY = FORCED
For delayed durability, there is no difference between an unexpected
shutdown and an expected shutdown/restart of SQL Server. Like
catastrophic events, you should plan for data loss. In a planned
shutdown/restart some transactions that have not been written to disk
may first be saved to disk, but you should not plan on it. Plan as
though a shutdown/restart, whether planned or unplanned, loses the
data the same as a catastrophic event.
This strange defect will hopefully be addressed in a future release, but until then it may be wise to make sure to automatically execute the 'sp_flush_log' procedure when SQL server is restarting or shutting down.
To perform asynchronous processing you can use Service Broker, but it isn't the only option; you can also use CLR objects.
The following is an example of a stored procedure (AsyncProcedure) that asynchronously calls another procedure (SyncProcedure):
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Runtime.Remoting.Messaging;
using System.Diagnostics;

public delegate void AsyncMethodCaller(string data, string server, string dbName);

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void AsyncProcedure(SqlXml data)
    {
        AsyncMethodCaller methodCaller = new AsyncMethodCaller(ExecuteAsync);
        string server = null;
        string dbName = null;
        // Use the context connection to find out which server and database we are running in.
        using (SqlConnection cn = new SqlConnection("context connection=true"))
        using (SqlCommand cmd = new SqlCommand("SELECT @@SERVERNAME AS [Server], DB_NAME() AS DbName", cn))
        {
            cn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                reader.Read();
                server = reader.GetString(0);
                dbName = reader.GetString(1);
            }
        }
        // Start the work on another thread and return immediately.
        methodCaller.BeginInvoke(data.Value, server, dbName, new AsyncCallback(Callback), null);
        //methodCaller.BeginInvoke(data.Value, server, dbName, null, null);
    }

    private static void ExecuteAsync(string data, string server, string dbName)
    {
        // The context connection cannot be used here, so open a regular connection.
        string connectionString = string.Format("Data Source={0};Initial Catalog={1};Integrated Security=SSPI", server, dbName);
        using (SqlConnection cn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("SyncProcedure", cn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("@data", SqlDbType.Xml).Value = data;
            cn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    private static void Callback(IAsyncResult ar)
    {
        AsyncResult result = (AsyncResult)ar;
        AsyncMethodCaller caller = (AsyncMethodCaller)result.AsyncDelegate;
        try
        {
            caller.EndInvoke(ar);
        }
        catch (Exception ex)
        {
            // handle the exception
            //Debug.WriteLine(ex.ToString());
        }
    }
}
It uses asynchronous delegates to call SyncProcedure:
CREATE PROCEDURE SyncProcedure(@data xml)
AS
INSERT INTO T(Data) VALUES (@data)
Example of calling AsyncProcedure:
EXEC dbo.AsyncProcedure N'<doc><id>1</id></doc>'
Unfortunately, the assembly requires UNSAFE permission.
I wonder if you could tag a record for change tracking by inserting into a "to process" table, including who made the change, etc.
Then another process could come along and copy the rest of the data on a regular basis.
There's a basic conflict between "does its job very well" and "unacceptable", obviously.
It sounds to me that you're trying to use triggers the same way you would use events in an OO procedural application, which IMHO doesn't map.
I would call any trigger logic that takes 30 seconds - no, more than 0.1 second - dysfunctional. I think you really need to redesign your functionality and do it some other way. I'd say "if you want to make it asynchronous", but I don't think this design makes sense in any form.
As far as "asynchronous triggers", the basic fundamental conflict is that you could never include such a thing between BEGIN TRAN and COMMIT TRAN statements because you've lost track of whether it succeeded or not.
Create history table(s). While updating (/deleting/inserting) main table, insert old values of record (deleted pseudo-table in trigger) into history table; some additional info is needed too (timestamp, operation type, maybe user context). New values are kept in live table anyway.
This way triggers run fast(er) and you can shift slow operations to log viewer (procedure).
Since SQL Server 2008 you can use the Change Data Capture (CDC) feature to log changes automatically; it is purely asynchronous.
Not that I know of, but are you inserting values into the audit table that also exist in the base table? If so, you could consider tracking just the changes. An insert would then track the change time, user and any extra info plus a bunch of NULLs (in effect the before values). An update would have the change time, user, etc. and the before value of the changed columns only. A delete would have the change time, etc. and all values.
Also, do you have an audit table per base table or one audit table for the whole DB? Of course, the latter can more easily result in waits, as each transaction tries to write to the one table.
I suspect that your trigger is one of these generic csv/text-generating triggers designed to log all changes for all tables in one place. Good in theory (perhaps...), but difficult to maintain and use in practice.
If you could run it asynchronously (which would still require storing the data somewhere so it can be logged later), then you are not auditing, nor do you have history to use.
Perhaps you could look at the trigger execution plan and see what bit is taking the longest?
Can you change how you audit, say, to per table? You could split the current log data into the relevant tables.